METHODS AND COMPOSITIONS RELATED TO ARGONAUTE PROTEINS



RELATED APPLICATIONS 

This application claims the benefit of priority to U.S. Provisional Patent
Application Nos. 60/592,269, filed on July 29, 2004, and 60/592,297, filed on July
28, 2004, which applications are hereby incorporated by reference in their entireties.



BACKGROUND OF THE APPLICATION 

The presence of double-stranded RNA (dsRNA) in most eukaryotic cells
provokes a sequence-specific silencing response known as RNA interference
(RNAi) (G.J. Hannon, Nature 418, 244 (2002); A. Fire et al., Nature 391, 806
(1998)). The dsRNA trigger of this process can be derived from exogenous sources
or transcribed from endogenous non-coding RNA genes that produce microRNAs
(miRNAs) (Hannon, supra; G. Hutvagner et al., Curr. Opin. Genet. Dev. 12, 225
(2002)). RNAi begins with the conversion of dsRNA silencing triggers into small
RNAs of ~21-26 nt in length (A. Hamilton et al., Embo J. 21, 4671 (2002)). This is
accomplished by processing of triggers by specialized RNase III family nucleases,
Dicer and Drosha (E. Bernstein et al., Nature 409, 363 (2001); Y. Lee et al., Nature
425, 415 (2003)). Resulting small RNAs join an effector complex, known as RISC
(RNA-Induced Silencing Complex) (S.M. Hammond et al., Nature 404, 293 (2000)).
Silencing by RISC can occur via several mechanisms. In flies, plants and fungi,
dsRNAs can trigger chromatin remodeling and transcriptional gene silencing (M.F.
Mette et al., Embo J. 19, 5194 (2000); I.M. Hall et al., Science 297, 2232 (2002); T.
Volpe et al., Science 22, 22 (2002); M. Pal-Bhadra et al., Mol. Cell 9, 315 (2002)).
RISC can also interfere with protein synthesis, and this is the predominant
mechanism used by miRNAs in mammals (P.H. Olsen et al., Dev. Biol. 216, 671
(1999); D.P. Bartel, Cell 116, 281 (2004)). However, the best-studied mode of
RISC action is mRNA cleavage (T. Tuschl et al., Genes Dev. 13, 3191 (1999); P.D.
Zamore, Cell 101, 25 (2000)). When programmed with a small RNA that is fully
complementary to the substrate RNA, RISC cleaves that RNA at a discrete position,
an activity that has been attributed to an unknown RISC component, "Slicer" (S.M.
Elbashir et al., Embo J. 20, 6877 (2001); J. Martinez et al., Cell 110, 563 (2002)).
Whether or not RISC cleaves a substrate can be determined by the degree of
complementarity between the siRNA and mRNA, as mismatched duplexes are often
not processed (Elbashir et al., supra). However, even for mammalian miRNAs,
which normally repress at the level of protein synthesis, cleavage activity can be
detected with a substrate that perfectly matches the miRNA sequence (G. Hutvagner
et al., Science 1, 1 (2002)). This prompted the hypothesis that all RISCs are equal
with the outcome of the RISC-substrate interaction being determined largely by the
character of the interaction between the small RNA and its substrate.

RISC contains two signature components. The first is the small RNA, which
co-fractionated with RISC activity in Drosophila S2 cell extracts (Hammond et al.,
supra) and whose presence correlated with dsRNA-programmed mRNA cleavage in
Drosophila embryo lysates (Tuschl et al., supra; Zamore et al., supra). The second
is an Argonaute protein, which was identified as a component of purified RISC in
Drosophila (S.M. Hammond et al., Science 293, 1146 (2001)). Subsequent studies
have suggested that Argonautes are also key components of RISC in mammals,
fungi, worms, protozoans and plants (Martinez et al., supra; M.A. Carmell et al.,
Nat. Struct. Mol. Biol. 11, 214 (2004)). To date, the identity of "Slicer" and the
function of Argonaute proteins are unknown.

BRIEF SUMMARY OF THE APPLICATION 

This application provides methods and compositions related to Argonaute 
proteins. 

A first aspect of the application provides a crystalline Argonaute. Certain
embodiments provide an isolated and purified Argonaute protein having a three-
dimensional structure defined by the atomic coordinates such as for example as
shown in Table 3. The crystalline Argonaute may comprise an archaeal Argonaute
protein. Alternatively, the crystalline Argonaute may comprise a mammalian
Argonaute protein, e.g., a human Argonaute protein such as human Ago-2.
Examples of mammalian Argonaute proteins may be Ago-1, Ago-2, Ago-3, or
Ago-4.

In certain embodiments, a crystalline Argonaute may comprise an Argonaute 
protein having an amino acid sequence that is 95% identical to SEQ ID NO: 2 (or 
human Ago-2) or a homologue, fragment, variant, or derivative thereof.

Alternatively, a crystalline Argonaute may comprise an Argonaute protein having an 
amino acid sequence that is 95% identical to SEQ ID NO: 2 (or human Ago-2) or a 
homologue, fragment, variant, or derivative thereof. 
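
As an illustration of how a 95% identity criterion might be checked, the following minimal Python sketch computes percent identity between two pre-aligned amino acid sequences. The function names, the convention of dividing identical positions by the number of columns in which neither sequence has a gap, and the default threshold handling are assumptions made for illustration; the application does not specify an alignment program or an identity convention.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Both inputs must already be aligned (equal length, gaps marked with `gap`).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0   # columns with a residue in both sequences
    identical = 0  # compared columns where the residues match
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * identical / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 95.0) -> bool:
    """True if the two aligned sequences are at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Other conventions (for example, dividing by the length of the shorter sequence or by the full alignment length) give different values, so the convention should be stated whenever an identity figure is reported.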

Certain embodiments provide a crystalline Argonaute comprising a three-
dimensional structure defined by all or a portion of the atomic co-ordinates such as
for example as set forth in Table 3.

The application also provides native crystals, derivative crystals or co-
crystals, that have a root mean square deviation ("r.m.s.d.") of less than or equal to
about 1.5 Angstrom when superimposed, using backbone atoms (N, Cα, C and O),
on the structure coordinates listed in Table 3.
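
The r.m.s.d. criterion can be sketched in Python as follows. This is a minimal illustration that assumes the two coordinate sets have already been optimally superimposed and that their backbone atoms (N, Cα, C and O) are listed in the same order; it does not perform the superposition itself. The function names and the default cutoff argument are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally sized lists of (x, y, z)
    coordinates, in the same units as the input (here, Angstrom).

    Assumes the structures are already superimposed and atoms are paired by index.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def within_rmsd_cutoff(coords_a, coords_b, cutoff=1.5):
    """True if the r.m.s.d. is less than or equal to `cutoff` (default 1.5 Angstrom)."""
    return rmsd(coords_a, coords_b) <= cutoff
```

In practice the superposition step (for example, a least-squares fit of the backbone atoms) would be carried out first by a structural biology package, and only then would a comparison like the one above be meaningful.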

A crystalline Argonaute of the application may comprise at least two
domains, e.g., a PAZ domain and a PIWI domain. A PIWI domain comprises a
carboxylate triad formed by the motif "DDX" (X refers to a third amino acid, e.g.,
E). A crystalline Argonaute of the application may comprise a PIWI domain having
a carboxylate triad formed by D597, D669, and a third amino acid.

A crystalline Argonaute of the application may comprise the following 
overall architecture: the N-terminus, middle, and PIWI domains form a crescent- 
shaped base; and the PAZ domain is positioned above the crescent shaped base; 
resulting in a cleft between said crescent-shaped base and the PAZ domain. 

In certain embodiments, a crystalline Argonaute permits an X-ray
crystallography resolution better than 2.25 Angstrom.

In certain embodiments, a crystalline Argonaute is soaked with one or more 
agents to form co-complex structures. 



WO 2006/015258 



PCT/US2005/027084 



A crystalline Argonaute may comprise a PIWI domain having an active site
defined by two or more amino acids, such as for example the "DDX" (X
representing a third amino acid, e.g., E) triad. A crystalline Argonaute may
comprise a PAZ domain having an active site defined by two or more amino acids.
In certain embodiments, an active site is capable of accommodating an agent, e.g., a
ligand or an inhibitor. A ligand or an inhibitor may be a nucleic acid molecule, a
peptidomimetic, or a small organic molecule. A ligand or an inhibitor may be
soaked in to form a co-complex. A nucleic acid molecule that is a ligand or an
inhibitor can be a single stranded RNA molecule, e.g., a single stranded RNA
molecule comprising between 15-50 nucleotides.

The application further provides an isolated complex comprising an
Argonaute protein and a single stranded RNA molecule hybridized to its target
nucleic acid. In certain embodiments, the single stranded RNA molecule is bound to
the PAZ domain of the Argonaute protein. In certain embodiments, the target
nucleic acid further interacts with the crescent-shaped base of the Argonaute protein.

A further aspect of the application provides a method of determining the
three-dimensional structure of an Argonaute protein or a mutant, derivative, variant,
analogue, homologue, sub-domain or fragment thereof. The method may comprise
aligning the amino acid sequence of the Argonaute mutant, derivative, variant,
analogue, homologue, sub-domain or fragment with the amino acid sequence of
PfAgo or as set forth in SEQ ID NO: 5 to match homologous regions of the amino
acid sequences. The method may further comprise modeling the structure of the
matched homologous regions of said target Argonaute protein of unknown structure
on the corresponding regions of the Argonaute protein structure as defined by the
atomic co-ordinates as set forth in Table 3. The method may also comprise
determining a conformation for the Argonaute mutant, derivative, variant, analogue,
homologue, sub-domain or fragment which substantially preserves the structure of
said matched homologous regions.
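
The alignment step of this method can be illustrated with a short Python sketch that, given a pairwise alignment of a target sequence against a template sequence, lists which target residues are matched to which template residues. These matched pairs are the positions at which template coordinates could be used as a starting model. The function name, the gap character and the 1-based numbering are assumptions for illustration; producing the alignment itself and building the model are left to dedicated software.

```python
def map_aligned_residues(aligned_target: str, aligned_template: str, gap: str = "-"):
    """Return (target_index, template_index) pairs, 1-based, for every alignment
    column in which both the target and the template have a residue.
    """
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    pairs = []
    target_index = 0    # running residue number in the ungapped target
    template_index = 0  # running residue number in the ungapped template
    for t, m in zip(aligned_target, aligned_template):
        if t != gap:
            target_index += 1
        if m != gap:
            template_index += 1
        if t != gap and m != gap:
            pairs.append((target_index, template_index))
    return pairs
```

Target residues that fall opposite a gap in the template have no counterpart in the template structure and would need to be modeled separately, for example as loops.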

A further aspect of the application provides a method of identifying an agent
that binds an Argonaute protein. The method may comprise applying a 3-
dimensional molecular modeling algorithm to the atomic coordinates of an
Argonaute protein shown in Table 3 to determine the spatial coordinates of the
binding pocket of the Argonaute protein. The method may further comprise
electronically screening the stored spatial coordinates of a set of candidate agents
against the spatial coordinates of the Argonaute protein binding pocket to identify
agents that can bind to the Argonaute protein.

The application also provides a computer-based method for the analysis of
the interaction of a molecular structure with an Argonaute protein. The method may
comprise providing a structure comprising a three-dimensional representation of said
Argonaute protein or a portion thereof, which representation comprises all or a
portion of the coordinates set forth in Table 3. The method may further comprise
providing a molecular structure to be fitted to said Argonaute protein structure. The
method may also comprise fitting the molecular structure to the Argonaute protein
structure, e.g., as set forth in the three-dimensional representation.

The application also provides a computer-readable storage medium encoded
with the atomic coordinates of an Argonaute protein as shown in Table 3. Other
embodiments also provide a data array comprising the atomic coordinates of an
Argonaute protein as set forth in Table 3.

The application further provides an electronic representation of a crystal
structure of an Argonaute protein. In certain embodiments, the electronic
representation may contain the atomic coordinates set forth in Table 3. Certain
embodiments also provide an electronic representation of a binding site of the
Argonaute protein. The binding site may be located in or be defined by the PAZ and/or
PIWI domain or a portion thereof. Certain embodiments also provide an electronic
representation of a domain of the Argonaute protein, e.g., a PIWI domain and/or a
PAZ domain. Certain embodiments also provide an electronic representation of an
agent in a binding site of an Argonaute protein, e.g., an active site of the Argonaute
protein.

The crystal structure, the electronic representation, as well as other aspects of
the application also relate to a method for identifying, designing, and/or optimizing
an RNAi construct or RNAi therapeutic of the invention, e.g., to improve an RNAi
therapeutic's pharmacokinetic and/or pharmacodynamic profile.

Another aspect of the application relates to a method of obtaining a crystal
formed by an Argonaute protein. The crystal may be grown using a precipitant. The
crystal may be grown in a buffer, the pH of which buffer may be varied. The crystal
may also be grown in the presence of a ligand or an inhibitor that interacts with the
Argonaute protein, e.g., a domain of the Argonaute protein. The quality of the
crystal can be improved by microseeding.

A further aspect of the application relates to a method of identifying an agent
that modulates the activity of an RNAi construct. The method may comprise
identifying an agent that modulates the expression and/or activity of an Argonaute
protein. The method may involve an Argonaute protein expressed in a cell. The
expressed Argonaute protein may be endogenous or exogenous to the cell. In
certain embodiments, the agent can modulate (e.g., increase) the RNase activity of
the Argonaute protein. The agent may alternatively or further modulate (e.g.,
increase) the expression of said Argonaute gene. In certain embodiments, an agent
modulates the RNase activity and/or expression of an Argonaute protein in a tissue
or cell type-specific manner.

In certain embodiments, the application relates to a method of identifying an
agent that modulates the activity of an RNAi therapeutic. The method may
comprise identifying an agent that modulates the expression and/or activity of an
Argonaute protein. The method may involve an Argonaute protein expressed in a
cell. The expressed Argonaute protein may be endogenous or exogenous to the cell.
In certain embodiments, the agent can modulate (e.g., increase) the RNase activity
of the Argonaute protein. The agent may alternatively or further modulate (e.g.,
increase) the expression of said Argonaute gene. In certain embodiments, an agent
modulates the RNase activity and/or expression of an Argonaute protein in a tissue
or cell type-specific manner.

In certain embodiments, an RNAi construct or an RNAi therapeutic
attenuates the expression of a target nucleic acid molecule. The attenuation may be
by 2, 3, 5, 10, or higher fold. The target nucleic acid molecule may comprise an
endogenous nucleic acid molecule. Alternatively, the target nucleic acid molecule is
heterologous to the genome of the cell. The heterologous nucleic acid molecule
may be a nucleic acid from a pathogen.
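
The fold attenuation referred to above can be expressed as the ratio of target expression without the RNAi construct to target expression with it. The following minimal Python sketch computes that ratio and reports the highest of the recited levels (2, 3, 5 or 10 fold) that it reaches. The function names and the way expression is measured are assumptions for illustration only.

```python
def fold_attenuation(control_expression: float, treated_expression: float) -> float:
    """Fold attenuation = expression without the RNAi construct divided by
    expression with it. Both values must be positive and in the same units.
    """
    if control_expression <= 0 or treated_expression <= 0:
        raise ValueError("expression values must be positive")
    return control_expression / treated_expression


def highest_fold_level_met(fold: float, levels=(2, 3, 5, 10)):
    """Return the largest level in `levels` that `fold` meets or exceeds,
    or None if the attenuation is below every listed level.
    """
    met = [level for level in levels if fold >= level]
    return max(met) if met else None
```

For example, a target whose expression drops from 100 units to 20 units is attenuated 5 fold under this definition.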

An RNAi construct or an RNAi therapeutic of the application may comprise
a nucleotide sequence at least 15 nucleotides in length that hybridizes to a target
nucleic acid molecule. In certain embodiments, an RNAi construct or an RNAi
therapeutic may comprise a hairpin nucleic acid. An RNAi construct or an RNAi
therapeutic of the application may also comprise a promoter operably linked to a
nucleotide sequence that hybridizes to a target nucleic acid molecule. The
promoter may be tissue or cell type-specific.

A further aspect of the application relates to a method of identifying an agent
that potentiates the activity of an RNAi construct. The method may comprise
identifying an agent that increases the expression and/or activity of an Argonaute
protein. The agent may increase the expression and/or activity of an Argonaute
protein in a tissue or cell type-specific manner.

Certain embodiments provide a method of identifying an agent that
potentiates the activity of an RNAi therapeutic. The method may comprise
identifying an agent that increases the expression and/or activity of an Argonaute
protein. The agent may increase the expression and/or activity of an Argonaute
protein in a tissue or cell type-specific manner.

Another aspect of the application provides a method of identifying an agent
that modulates the activity of an RNAi construct. The method may comprise
providing an isolated or recombinant Argonaute protein and assaying the RNase
activity of the Argonaute protein in the presence of a candidate agent. A change in
the RNase activity of the Argonaute protein in the presence of a candidate agent is
indicative of the candidate agent capable of modulating the activity of the RNAi
construct. The change may be relative to the RNase activity of the Argonaute
protein in the absence of the candidate agent or a baseline or control level of the
RNase activity of Argonaute protein. The method may involve an Argonaute
protein expressed in a cell. Alternatively, the method may involve an isolated or
purified Argonaute protein. The method may further comprise determining the
RNase activity of said Argonaute protein in the absence of a candidate agent. The
identified agent may modulate the activity of an RNAi construct in a tissue or cell
type-specific manner.
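
The comparison described in this method, RNase activity with a candidate agent versus activity without it, can be sketched as a simple ratio test. In the Python sketch below, the function name, the returned labels and the 10% tolerance band used to decide that an activity is unchanged are all hypothetical choices made for illustration; the application does not prescribe a threshold or a readout.

```python
def classify_agent(activity_with_agent: float,
                   activity_without_agent: float,
                   tolerance: float = 0.1) -> str:
    """Classify a candidate agent by the ratio of RNase activity measured with
    the agent to the activity measured without it (the control).

    Ratios within +/- `tolerance` of 1 are treated as no change; the tolerance
    is an illustrative assumption, not a value taken from the application.
    """
    if activity_without_agent <= 0:
        raise ValueError("control activity must be positive")
    ratio = activity_with_agent / activity_without_agent
    if ratio > 1 + tolerance:
        return "potentiates"
    if ratio < 1 - tolerance:
        return "inhibits"
    return "no change"
```

A real assay would replace the fixed tolerance with a statistical test over replicate measurements.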

Certain embodiments provide a method of identifying an agent that
modulates the activity of an RNAi therapeutic. The method may comprise providing
an isolated or recombinant Argonaute protein and assaying the RNase activity of the
Argonaute protein in the presence of a candidate agent. A change in the RNase
activity of the Argonaute protein in the presence of a candidate agent is indicative of
the candidate agent capable of modulating the activity of the RNAi therapeutic. The
change may be relative to the RNase activity of the Argonaute protein in the absence
of the candidate agent or a baseline or control level of the RNase activity of
Argonaute protein. The method may involve an Argonaute protein expressed in a
cell. Alternatively, the method may involve an isolated or purified Argonaute
protein. The method may further comprise determining the RNase activity of said
Argonaute protein in the absence of a candidate agent. The identified agent may
modulate the activity of an RNAi construct in a tissue or cell type-specific manner.

A further aspect of the application provides a composition for targeted gene
inhibition comprising an agent that modulates the RNase activity of an Argonaute
protein. The composition may further comprise an RNAi construct or an RNAi
therapeutic targeting a gene. In certain embodiments, an agent may potentiate the
RNase activity of the Argonaute protein. Alternatively, an agent may inhibit the
RNase activity of the Argonaute protein. In certain embodiments, the RNAi
construct or therapeutic may target a gene in a first tissue or cell type; the identified
agent may potentiate the RNase activity of the Argonaute protein in said first tissue
or cell type. In certain embodiments, the identified agent may inhibit the RNase
activity of the Argonaute protein in a second tissue or cell type.

The application also provides a pharmaceutical preparation comprising the
compositions described herein and a physiologically acceptable carrier.

A further aspect of the invention relates to a cell line that overexpresses an
Argonaute protein. The cell line may overexpress a mammalian Argonaute
protein, e.g., a human Argonaute protein. A mammalian Argonaute protein may be
Ago-1, Ago-2, Ago-3, or Ago-4. The cell line may alternatively overexpress an
Argonaute protein having an amino acid sequence that is 95% identical to an amino
acid sequence as set forth in SEQ ID NOs.: 1-4, or a homologue, fragment, variant,
or derivative thereof. The cell line may alternatively overexpress an Argonaute
protein encoded by a nucleic acid molecule having a sequence that is 95% identical
to a nucleic acid sequence as set forth in any one of SEQ ID NOs.: 1-4. The cell line
may alternatively overexpress an Argonaute protein encoded by a nucleic acid
molecule that hybridizes under high stringency conditions to a nucleic acid sequence
as set forth in any one of SEQ ID NOs.: 1-4. The cell line may alternatively
overexpress an Argonaute protein having an amino acid sequence set forth in any
one of SEQ ID NOs.: 1-4.

Another aspect of the application relates to a cell line that expresses a mutant
Argonaute protein comprising an amino acid sequence that is different from a
naturally-occurring Argonaute protein.

A further aspect of the application relates to a host (e.g., a cell or an animal)
wherein the expression of an endogenous Argonaute protein is controlled by, e.g., a
transgene (or a nucleic acid construct such as for example the construct based on the
Puro PGK vector described herein).

The application also provides an assay for identifying nucleic acid sequences
for conferring a particular phenotype in a cell, comprising constructing a library of
nucleic acid sequences oriented to produce double stranded RNA. The assay may
further comprise introducing a dsRNA library into a culture of target cells. The assay
may also comprise identifying members of the library which confer a particular
phenotype on the cell, and identifying the sequence from the cell which is identical
or homologous to the library member.



Another aspect of the invention provides a nucleic acid composition
comprising a first nucleic acid comprising an RNAi construct and a second nucleic
acid encoding an Argonaute protein. The RNAi construct may comprise a
nucleotide sequence encoding a single-strand siRNA; the nucleotide sequence may
be operably linked to a promoter. In certain embodiments, the second nucleic acid
encodes a human Argonaute protein and may be operably linked to a promoter.
Alternatively, the second nucleic acid may encode a non-naturally-occurring
Argonaute protein. In certain embodiments, the RNAi construct may be tissue or
cell type-specific. The promoters may be tissue or cell type-specific.

A further aspect of the application provides a cell expressing any of the 
nucleic acid compositions described herein. 

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1 shows the crystal structure of Pyrococcus furiosus Argonaute. Stereo ribbon
representation of Argonaute with the N-terminal domain shown in blue, the "stalk"
in light blue, the PAZ domain in red, the middle domain in green, the PIWI domain
in purple and the interdomain connector in yellow. The active site residues are
drawn in stick representation. Disordered loops are drawn as dotted lines. The N-
terminal, middle and PIWI domains form a crescent base. The "stalk" holds the
PAZ domain above the crescent base and the interdomain connector cradles the
molecule. This figure as well as figures 2A, 3A,B, 5B were prepared with
BobScript (60), MolScript (61) and Raster3D (62, 63).

Figs. 2A-2B show that the PAZ domains of PfAgo and hAgo1 have very similar
structures. (Fig. 2A) Stereo diagram of the superposition of Cα atoms from the PAZ
domain of PfAgo shown in red and the PAZ domain of hAgo1 shown in gray.
Dotted lines represent disordered regions. (Fig. 2B) Sequence alignment of the PAZ
domains of PfAgo, hAgo1 and DmAgo2 based on the structural superposition of the
three domains. The sequence of PfAgo-PAZ domain could not be readily aligned
with PAZ domains from other species without knowledge of the structure. The
secondary structure elements for PfAgo are shown above the sequence.

Figs. 3A-3C show that PIWI is an RNase H domain. (Fig. 3A) Ribbon diagrams of
the PIWI domain, E. coli RNase HI and M. jannaschii RNase HII. The three
structures were superimposed and shown in a similar view with the secondary
structure elements of the canonical RNase H fold in color. The active site residues
are shown in stick representation. (Fig. 3B) A close-up view of the active sites. This
view is rotated ~180° compared to the view in A. One active site aspartate is always
located on β1 of the fold (the red strand) in this family of proteins and another
aspartate is always located on β4 of the fold (the green strand). The third active site
carboxylate, a glutamic acid, varies in its position. The Mg2+ ion in RNase HI is
shown as a pink sphere. A strong difference electron density found in the active site
of PIWI that was assigned as a water molecule is shown as a green sphere. (Fig. 3C)
Sequence alignment of the PIWI domains from Pf Argonaute and the four human
Argonaute proteins. Invariant residues are highlighted in purple and conserved
residues are highlighted in blue. The secondary structure elements are shown above
the structure. The conserved active site carboxylate residues are marked by a red
asterisk.

Figs. 4A-4B show siRNA binding. (Fig. 4A) A 5'-phosphorylated ss-siRNA (4 nM)
was radiolabeled by phosphorylation with γ-32P-ATP and hybridized with an
unlabeled complementary strand to yield a ds-siRNA and was gel purified. The ss-
and ds-siRNAs were UV-crosslinked to PfAgo and the adducts were resolved by
SDS-PAGE. PfAgo binds preferentially to the ss-siRNA compared to the
ds-siRNA. (Fig. 4B) Competition experiments were performed with the same labeled
ss-siRNA and UV-crosslinking to PfAgo in the presence of increasing amounts of
the indicated competitors (from 0 to 400 nM), showing preferential binding to a 5'-
phosphorylated ss-siRNA compared to unphosphorylated ss-siRNA.

Figs. 5A-5C illustrate a model for siRNA-guided mRNA cleavage by Argonaute.

(Fig. 5A) Two views of the electrostatic surface potential of PfAgo indicating a
positively charged groove suitable for interaction with nucleic acids. The locations
of the domains are labeled and the approximate location of the active site in PIWI is
marked by a yellow asterisk. The view on the left is slightly tilted on the horizontal
axis compared to the view in Figure 1. Two of the loops were removed for a better
view of the groove. The binding groove runs horizontally across the protein bending
upwards between the PAZ and N-terminal domains on the right and bending around
between the PAZ and middle domains on the left. The view on the right is from the
proposed exit groove of the mRNA and looking into the active site area (rotated
~90° compared to the view on the left). The PIWI domain is behind the middle
domain in this view. The coloring scheme depicts potentials < -10 kBT in red and >
10 kBT in blue, where kB is the Boltzmann constant and T is the absolute temperature.
This figure was prepared with GRASP (64).

(Fig. 5B) A model for siRNA and mRNA binding. Argonaute is shown as a ribbon
representation in gray. A 3' portion of the siRNA, shown in purple, was placed by
superposition of the PAZ domain of the hAgo1-PAZ domain-RNA complex on the
PAZ domain of PfAgo. The two nucleotides at the 3'-end of the siRNA are inserted
in the PAZ cleft and the nucleotides 5' to those bind along the PAZ domain. The
passenger strand of the hAgo1-PAZ complex placed in a similar manner was used to
model the mRNA strand, shown in light blue, by extending the RNA 2 nucleotides
at the 5'-end, and from the middle of that strand along the binding groove towards
the active site in PIWI. The 5'-end of the mRNA is nested between the PAZ and N-
terminal domains, across the stalk. The phosphate between the 11th and 12th
nucleotides from the 5'-end of the mRNA falls near the active site residues shown in
red.

(Fig. 5C) Schematic depiction of the model for siRNA-guided mRNA cleavage.
The domains are colored as in Fig. 1. The siRNA, shown in yellow, binds with its
3'-end in the PAZ cleft and the 5'-end is predicted to reach the other end of the molecule
and likely bind there. The mRNA, depicted in brown, comes in between the N-
terminal and PAZ domains and out between the PAZ and middle domain. The
active site in the PIWI domain, depicted as scissors, cleaves the mRNA opposite the
middle of the siRNA guide.

Fig. 6 shows sequence alignment of the PAZ domains of PfAgo, hAgo1 and
DmAgo2 based on the structural superposition of the three domains. The sequence
of PfAgo-PAZ domain could not be readily aligned with PAZ domains from other
species without knowledge of the structure. Invariant residues are highlighted in
purple and conserved residues are highlighted in blue. The secondary structure
elements for PfAgo are shown above the sequence.

Fig. 7 shows sequence alignment of the PIWI domains from Pf Argonaute and the
four human Argonaute proteins. Invariant residues are highlighted in purple and
conserved residues are highlighted in blue. The secondary structure elements are
shown above the structure. The conserved active site carboxylate residues are
marked by a red asterisk. Accession numbers are as follows: PfAgo (AAL80661),
hAgo1 (NM_012199), hAgo2 (NM_012154), hAgo3 (NM_024852) and hAgo4
(NM_017629).

Fig. 8 shows another view of the electrostatic surface potential of PfAgo shown
from the proposed exit groove of the mRNA and looking into the active site area
(rotated ~90° around y and ~20° around x compared to the view in Fig. 4A). The PIWI
domain is behind the middle domain in this view.

Fig. 9 shows that only mammalian Ago2 can form cleavage-competent RISC. Panel
A: The miRNA populations associated with Ago1, Ago2 and Ago3 were measured
by microarray analysis as described in Methods. The heat map shows normalized
log-ratio values for each dataset, with yellow representing increased relative
amounts, and blue indicating decreased amounts, relative to the median. The top 25
log-ratios are shown in the expanded region. In each panel, "control" indicates
parallel analysis of cells transfected with a vector control. Panel B: 293T cells were
transfected with a control vector or with vectors encoding myc-tagged Ago1, Ago2
or Ago3, as indicated, along with an siRNA that targets firefly luciferase.
Immunoprecipitates were tested for siRNA directed mRNA cleavage as described in
Methods. Positions of 5' and 3' cleavage products are shown. Panel C:
Immunoprecipitates as in Panel B were tested for in vivo siRNA binding by
Northern blotting of Ago immunoprecipitates (see Methods). Panel D: Western
blots of transfected cell lysates show similar levels of expression for each
recombinant Argonaute protein.

Fig. 10 shows that Argonaute2 is essential for mouse development. Panel A: Total
RNA from wild-type or mutant embryos was tested for expression of Ago1, Ago2
or Ago3 by RT-PCR. Actin was also examined as a control. Panel B: At day E10.5,
Ago2 null embryos show severe developmental delay as compared to heterozygous
and wild-type littermates. These embryos also show a variety of developmental
defects including swelling inside the pericardial membrane (Panel C, h=heart,
indicated by the arrow) and failure to close the neural tube (Panel D, Panel E).
Arrows in Panel D indicate the edges of the neural tube that has failed to close. In
caudal regions where the neural tube does close, it has an abnormal appearance,
being wavy as compared to wild-type embryos (Panel E, compare wt and Ago2 -/-).
Ago2 is expressed in most tissues of the developing embryo as measured by in situ
hybridization (Panel F) or analysis of an Ago2 gene trap animal (Panel G). In Panel
F, f=forebrain, b=branchial arches, h=heart and lb=limb bud, all of which are
relative hot spots for Ago2 mRNA. In Panel G, the left embryo shows similar
patterns when staining for the gene-trap marker, β-galactosidase, proceeds for only a
short period. Longer incubation (Panel G, right) gives uniform staining throughout
the embryo.

Fig. 11 shows that Argonaute2 is essential for RNAi in MEF. Panel A: RT-PCR of
mRNA prepared from wild-type or Ago2-/- MEF reveals consistent expression of
Ago1 and Ago3 but a specific lack of Ago2 expression in the null MEF. Actin
mRNA serves as a control. Panel B: Wild-type and mutant MEFs were co-
transfected with plasmids encoding Renilla and firefly luciferases either with or
without firefly siRNA as indicated. Ratios of firefly to Renilla activity, normalized
to 1 for the no-siRNA control were plotted. For each genotype, the ability of Ago1
and Ago2 to rescue suppression was tested by co-transfection with expression
vectors encoding each protein as indicated. Panel C: NIH-3T3 cells, wild-type
MEF or Ago2 mutant MEF were tested as described in B (except that Renilla/firefly
ratios are plotted) for their ability to suppress a reporter of repression at the level of
protein synthesis. In this case, the Renilla luciferase mRNA contains multiple,
imperfect binding sites for a CXCR4 siRNA. Cells were transfected with a mixture
of firefly and Renilla luciferase plasmids with or without (as indicated) the siRNA.

Fig. 12 shows mapping of the requirements for assembly of cleavage-competent
RISC. Ago1, Ago2 or the indicated mutants of Ago2 were expressed as myc-tagged
fusion proteins in 293T cells. In all cases, expression constructs were co-transfected
with a luciferase siRNA. Western blotting (not shown) indicated similar expression
for each mutant. Immunoprecipitates containing individual proteins were tested for
cleavage activity against a luciferase mRNA. Positions of 5' and 3' cleavage
products are indicated. siRNA binding was examined for each mutant by Northern
blotting of immunoprecipitates or by staining of immunoprecipitates with SYBR Gold
(Molecular Probes). Representatives for these assays are shown. In no case was a
defect in interaction of mutants with siRNAs detected.

Fig. 13 shows that Argonaute2 is a candidate for Slicer. Panel A: Ago2 protein was
immunoaffinity purified from transiently transfected 293T cells. The preparation
contained two major proteins (Protein Gel), in addition to heavy and light chains.
These were identified by mass spectrometry as Ago2 and HSP90.
Immunoprecipitates were mixed (see Methods) in vitro with single- or double-stranded
siRNAs or with a 21 nt DNA having the same sequence as the siRNA, as
indicated. Reconstituted RISC was tested for cleavage activity with a uniformly
labeled synthetic mRNA. Positions of 5' and 3' cleavage products are noted. Where
indicated, the siRNA was not 5' phosphorylated and in one case, ATP was not added
to the reconstitution reaction. Panel B: Ago2 or Ago2 mutants (as indicated) were
assembled into RISC in vivo by co-transfection with siRNAs followed by
immunoaffinity purification or by in vitro reconstitution, mixing affinity purified
proteins with ss-siRNAs. These were tested for activity against a complementary
mRNA substrate. 5' and 3' cleavage products are as in Panel A. Both mutant
proteins were expressed at levels similar to wild-type Ago2 and bound siRNAs as
readily (Panel C, Panel D). Ago2 (H634P) and (Q633R) behave similarly in this
assay.

Fig. 14 shows cleavage by Ago2-containing RISC irrespective of siRNA sequence.
Ago2-containing RISCs were formed in vivo by co-transfection. Complexes were
recovered by immunoprecipitation and tested for cleavage activity with a uniformly
labeled, synthetic mRNA. Positions of 5' and 3' cleavage products expected for
each reaction are indicated.

Fig. 15 shows construction of Ago2 mutant mice. The insertional disruption
strategy for inactivating mouse Ago2 is shown, along with a Southern blot of DNA
from wild-type, heterozygous, and null embryos. Probe is indicated by asterisk. For
reference, PAZ domain is encoded by exons 5-8. The insertion duplicates exons 3-6,
which includes two exons of the PAZ domain, and inserts ~10 kb of vector
sequences into the gene, creating a high probability that any truncated protein that
might be generated from this allele would be non-functional. Additionally, no Ago2
mRNA was detected from these cells by RT-PCR. However, all of the coding
capacity of Ago2 does still exist in the mutant genome. Therefore, although all
available evidence indicates a null mutation, the possibility cannot be completely
ruled out that this mutant can still synthesize a small amount of Ago2, making it a
severe hypomorph rather than a null. Southern blots showing the patterns for
wild-type, heterozygous and mutant animals are shown below the disruption strategy.

Fig. 16 shows expression analysis of Ago3 in embryos. Embryonic day 9.5 embryos
were collected from timed matings of wild-type animals. These were stained for
expression of Ago3 mRNA by in situ hybridization as described in Methods. Ago3
shows the same expression pattern as is seen in parallel analyses of Ago2 mRNA
expression (see Fig. 10, Panel F).

Fig. 17 shows that Ago2-mutant MEF are defective for siRNA-mediated repression.
WT and Ago2-mutant MEF (genotypes indicated on the left) were transfected with a
combination of plasmids encoding dsRed and GFP, either with or without GFP
siRNAs (as indicated on the right). Microscopic examination revealed consistent
co-expression of dsRed and GFP in the absence of siRNAs in both WT and mutant
cells. siRNAs eliminated co-expression of GFP in WT cells but did not alter GFP
expression in Ago2-/- cells.

Fig. 18 shows that intact Ago2 is required for formation of cleavage-competent
RISC. Deletions within Ago2 are indicated schematically. Plasmids encoding
epitope-tagged versions of each deletion mutant were co-transfected into 293T cells
with an siRNA to firefly luciferase. Wild-type Ago2 was similarly expressed as a
control. RISCs were immunoaffinity purified and tested for activity against a
uniformly labeled mRNA substrate. Each protein was expressed as indicated by
Western blotting with a myc antiserum, but none of the deletion mutants bound
siRNAs, as determined by Northern blotting of immunoprecipitates.

Fig. 19 shows that Ago2 can be reconstituted with different siRNAs. Ago2 was
immunoaffinity purified (see Fig. 13) and reconstituted in vitro with single-stranded
siRNAs that target either the sense strand or the antisense strand of a firefly
luciferase mRNA. Similar complexes were formed in parallel with purified Ago1.
In each case, Ago2 cleaved the complementary mRNA, whereas Ago1 complexes
were inert. Positions of 5' and 3' cleavage products are indicated.

Fig. 20 shows that RISC is a metal-dependent nuclease. As previously shown, RISC
requires a divalent metal for activity (Hannon, supra). Similarly, RISC,
reconstituted in vitro with single-stranded siRNAs, depends on Mg++ for activity, as
indicated by the ability to inhibit the complex with EDTA but not with EGTA (as
indicated).

Fig. 21 shows that active site residues are conserved among Ago proteins. Putative
active site aspartate residues in the PIWI domain were identified with reference to
the structure of the P. furiosus Ago protein. These were also conserved in Ago
proteins from a variety of species. Additionally, residues identified by a mutational
analysis (e.g., H634) were also highly conserved.

Fig. 22 shows sequence alignment of mammalian Ago1 family members. An
alignment of the protein sequences of human Argonautes 1-4 highlights a very high
degree of sequence conservation. Red indicates highly conserved, blue moderately
conserved residues. Residues mutated in Ago2 in this study are indicated in green
and by asterisks (see below). The PAZ domain is indicated by the yellow bar and
the PIWI by the orange bar (boundaries set as determined by structural data).
Accession numbers for individual genes are as follows: Ago1 (NM_012199), Ago2
(NM_012154), Ago3 (NM_024852), Ago4 (NM_017629).

Fig. 23 shows Table 1 which provides crystallographic statistics for Argonaute.

Fig. 24 shows Table 2 which provides additional crystallographic statistics for 
Argonaute. 

Fig. 25 shows Table 3 which provides the atomic coordinates for Argonaute.

DETAILED DESCRIPTION OF THE APPLICATION

Overview 

Argonautes are often present as multiprotein families and are identified by
two characteristic domains, PAZ and PIWI (21). These proteins mainly segregate
into two sub-families, comprising those that are more similar to either Arabidopsis
Argonaute1 or Drosophila Piwi. The Argonaute family was first linked to RNAi
through genetic studies in C. elegans, which identified Rde-1 as a gene essential for
silencing (22). Subsequent placement of a Drosophila Argonaute protein in RISC
(19) makes it desirable to explore the unknown roles of this protein family. Toward
this end, this application provides methods and compositions related to Argonaute.
These methods and compositions are based on results obtained from structural
studies of Argonaute proteins, as well as biochemical and genetic studies of a
subfamily of Argonaute proteins in mammals. As used herein, the term "Argonaute"
refers to a protein which (a) mediates an RNAi response and (b) has an amino acid
sequence at least 50 percent identical, and more preferably at least 75, 85, 90 or 95
percent identical, to SEQ ID NOs: 1-5.
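
By way of illustration only, the percent-identity criterion in the definition above can be sketched as follows. This is a minimal sketch, not a prescribed method: it assumes the two sequences have already been aligned by an external tool, that gaps are written as "-", and that identity is counted over columns in which both sequences have a residue (other conventions for the denominator exist). The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where both sequences have a residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # gap columns are skipped (an assumed convention)
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 50.0) -> bool:
    """True if the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy, made-up sequences; not SEQ ID NOs: 1-5.
    print(percent_identity("ACDE-G", "ACDFHG"))
    print(meets_identity_threshold("ACDE-G", "ACDFHG", threshold=75.0))
```

The threshold argument can be set to 50, 75, 85, 90 or 95 to mirror the levels recited above.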

Structural Studies of Argonaute 

The crystal structure of Argonaute is useful for in silico screening of agents
that bind to Argonaute and/or modulate its activity. The candidate agents generated
from the in silico screening can be further screened in biochemical assays to select
for agents that modulate the activity of Argonaute.

I. Crystallization and Structure Determination 

X-ray crystallography is a method of solving the three-dimensional structures
of molecules. The structure of a molecule is calculated from X-ray diffraction
patterns using a crystal as a diffraction grating. Three-dimensional structures of
protein molecules are determined from crystals grown from a concentrated aqueous
solution of that protein. The process of X-ray crystallography can include the
following steps:

(a) synthesizing and isolating (or otherwise obtaining) a polypeptide;

(b) growing a crystal from an aqueous solution comprising the polypeptide
with or without a modulator; and

(c) collecting X-ray diffraction patterns from the crystals, determining unit
cell dimensions and symmetry, determining electron density, fitting the amino acid
sequence of the polypeptide to the electron density, and refining the structure.

a. Production of Polypeptides 

The Argonaute polypeptides described herein may be chemically synthesized
in whole or part using techniques that are well-known in the art (see, e.g., Creighton
(1983) Biopolymers 22(1):49-58).

Alternatively, methods which are well known to those skilled in the art can
be used to construct expression vectors containing the native or mutated Argonaute
polypeptide coding sequence and appropriate transcriptional/translational control
signals. These methods include in vitro recombinant DNA techniques, synthetic
techniques and in vivo recombination/genetic recombination. See, for example, the
techniques described in Maniatis, T. (1989) Molecular Cloning: A Laboratory
Manual (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York);
and Ausubel, F. M. et al. (1994) Current Protocols in Molecular Biology
(John Wiley & Sons, Secaucus, N.J.).

A variety of host-expression vector systems may be utilized to express the
Argonaute coding sequence. These include but are not limited to microorganisms
such as bacteria transformed with recombinant bacteriophage DNA, plasmid DNA
or cosmid DNA expression vectors containing the Argonaute domain coding
sequence; yeast transformed with recombinant yeast expression vectors containing
the Argonaute domain coding sequence; insect cell systems infected with
recombinant virus expression vectors (e.g., baculovirus) containing the Argonaute
domain coding sequence; plant cell systems infected with recombinant virus
expression vectors (e.g., cauliflower mosaic virus, CaMV; tobacco mosaic virus,
TMV) or transformed with recombinant plasmid expression vectors (e.g., Ti
plasmid) containing the Argonaute domain coding sequence; or animal cell systems.
The expression elements of these systems vary in their strength and specificities.

Depending on the host/vector system utilized, any of a number of suitable
transcription and translation elements, including constitutive and inducible
promoters, may be used in the expression vector. For example, when cloning in
bacterial systems, inducible promoters such as pL of bacteriophage λ, plac,
ptrp, ptac (ptrp-lac hybrid promoter) and the like may be used; when cloning in
insect cell systems, promoters such as the baculovirus polyhedrin promoter may be
used; when cloning in plant cell systems, promoters derived from the genome of
plant cells (e.g., heat shock promoters; the promoter for the small subunit of
RUBISCO; the promoter for the chlorophyll a/b binding protein) or from plant
viruses (e.g., the 35S RNA promoter of CaMV; the coat protein promoter of
TMV) may be used; when cloning in mammalian cell systems, promoters derived
from the genome of mammalian cells (e.g., metallothionein promoter) or from
mammalian viruses (e.g., the adenovirus late promoter; the vaccinia virus 7.5K
promoter) may be used; when generating cell lines that contain multiple copies of
the Argonaute domain DNA, SV40-, BPV- and EBV-based vectors may be used
with an appropriate selectable marker.

Exemplary methods of DNA manipulation, vectors,
various types of cells used, methods of incorporating the vectors into the cells,
expression techniques, protein purification and isolation methods, and protein
concentration methods are disclosed in detail in PCT publication WO 96/18738.
This publication is incorporated herein by reference in its entirety, including any
drawings. Those skilled in the art will appreciate that such descriptions are
applicable to the present invention and can be easily adapted to it.

b. Crystal Growth 

Crystals are grown from an aqueous solution containing the purified and
concentrated Argonaute polypeptide by a variety of techniques. These techniques
include batch, liquid bridge, dialysis, vapor diffusion, and hanging drop methods.
See McPherson (1982) John Wiley, New York; McPherson (1990) Eur. J. Biochem.
189:1-23; Webber (1991) Adv. Protein Chem. 41:1-36, incorporated by reference
herein in their entireties, including all figures, tables, and drawings.

The native crystals of the application are, in general, grown by adding
precipitants to the concentrated solution of the polypeptide. The precipitants are
added at a concentration just below that necessary to precipitate the protein. Water
is removed by controlled evaporation to produce precipitating conditions, which are
maintained until crystal growth ceases.

For crystals of the application, exemplary crystallization conditions are
described in the Examples. Those of ordinary skill in the art will recognize that the
exemplary crystallization conditions can be varied. Such variations may be used
alone or in combination. In addition, other crystallization conditions may be found,
e.g., by using crystallization screening plates to identify such other conditions.

c. X-Ray Diffraction 

The diffraction data from X-ray crystallography are generally obtained as
follows. When a crystal is placed in an X-ray beam, the incident X-rays interact
with the electron cloud of the molecules that make up the crystal, resulting in X-ray
scatter. The combination of X-ray scatter with the lattice of the crystal gives rise to
nonuniformity of the scatter; areas of high intensity are called diffracted X-rays.
The angle at which diffracted beams emerge from the crystal can be computed by
treating diffraction as if it were reflection from sets of equivalent, parallel planes of
atoms in a crystal (Bragg's Law). The most obvious sets of planes in a crystal lattice
are those that are parallel to the faces of the unit cell. These and other sets of planes
can be drawn through the lattice points. Each set of planes is identified by three
indices, hkl. The h index gives the number of parts into which the a edge of the unit
cell is cut, the k index gives the number of parts into which the b edge of the unit
cell is cut, and the l index gives the number of parts into which the c edge of the
unit cell is cut by the set of hkl planes. Thus, for example, the 235 planes cut the a
edge of each unit cell into halves, the b edge of each unit cell into thirds, and the c
edge of each unit cell into fifths. Planes that are parallel to the bc face of the unit
cell are the 100 planes; planes that are parallel to the ac face of the unit cell are the
010 planes; and planes that are parallel to the ab face of the unit cell are the 001
planes.
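
By way of illustration only, the relationship between the hkl indices, the spacing of the corresponding planes, and the Bragg angle can be sketched as follows. This is a minimal sketch that assumes a unit cell whose three angles are all 90 degrees, so that 1/d^2 = (h/a)^2 + (k/b)^2 + (l/c)^2, together with Bragg's Law in the form n*lambda = 2*d*sin(theta). The function names and the cell edges used in the example are hypothetical and are not taken from the Argonaute crystals.

```python
import math


def d_spacing_orthogonal(h, k, l, a, b, c):
    """Spacing d_hkl (same length unit as a, b, c) for a cell with 90-degree angles."""
    inv_d_sq = (h / a) ** 2 + (k / b) ** 2 + (l / c) ** 2
    if inv_d_sq == 0:
        raise ValueError("h, k and l cannot all be zero")
    return 1.0 / math.sqrt(inv_d_sq)


def bragg_angle_deg(d, wavelength, order=1):
    """Bragg angle theta in degrees from n*lambda = 2*d*sin(theta)."""
    sin_theta = order * wavelength / (2.0 * d)
    if sin_theta > 1.0:
        raise ValueError("these planes cannot diffract at this wavelength")
    return math.degrees(math.asin(sin_theta))


if __name__ == "__main__":
    a, b, c = 40.0, 60.0, 100.0   # hypothetical cell edges in Angstroms
    wavelength = 1.5418           # assumed Cu K-alpha wavelength in Angstroms
    for hkl in [(1, 0, 0), (0, 1, 0), (0, 0, 1), (2, 3, 5)]:
        d = d_spacing_orthogonal(*hkl, a, b, c)
        theta = bragg_angle_deg(d, wavelength)
        print(hkl, round(d, 3), round(theta, 3))
```

In this sketch the 100 planes have a spacing equal to the a edge, consistent with the description of planes parallel to the bc face, and planes with higher indices have smaller spacings and therefore diffract at larger angles.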

When a detector is placed in the path of the diffracted X-rays, in effect
cutting into the sphere of diffraction, a series of spots, or reflections, is recorded to
produce a "still" diffraction pattern. Each reflection is the result of X-rays reflecting
off one set of parallel planes, and is characterized by an intensity, which is related to
the distribution of molecules in the unit cell, and hkl indices, which correspond to
the parallel planes from which the beam producing that spot was reflected. If the
crystal is rotated about an axis perpendicular to the X-ray beam, a large number of
reflections is recorded on the detector, resulting in a diffraction pattern.

The unit cell dimensions and space group of a crystal can be determined
from its diffraction pattern. First, the spacing of reflections is inversely proportional
to the lengths of the edges of the unit cell. Therefore, if a diffraction pattern is
recorded when the X-ray beam is perpendicular to a face of the unit cell, two of the
unit cell dimensions may be deduced from the spacing of the reflections in the x and
y directions of the detector, the crystal-to-detector distance, and the wavelength of
the X-rays. Those of skill in the art will appreciate that, in order to obtain all three
unit cell dimensions, the crystal must be rotated such that the X-ray beam is
perpendicular to another face of the unit cell. Second, the angles of a unit cell can
be determined by the angles between lines of spots on the diffraction pattern. Third,
the absence of certain reflections and the repetitive nature of the diffraction pattern,
which may be evident by visual inspection, indicate the internal symmetry, or space
group, of the crystal. Therefore, a crystal may be characterized by its unit cell and
space group, as well as by its diffraction pattern.
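
By way of illustration only, the deduction of a unit cell edge from the spacing of reflections, the crystal-to-detector distance and the wavelength can be sketched as follows. This is a minimal sketch under stated assumptions: a flat detector perpendicular to the beam, a spacing measured from the direct-beam position to the first-order reflection along one detector direction, and a cell axis lying perpendicular to the beam. The function names and numerical values are hypothetical.

```python
import math


def cell_edge_from_spot_spacing(spacing_mm, distance_mm, wavelength_A):
    """Cell edge (Angstroms) from the first-order spot position on a flat detector.

    The spot lies at 2*theta from the direct beam, with tan(2*theta) = spacing/distance,
    and Bragg's Law for first order gives edge = wavelength / (2*sin(theta)).
    """
    two_theta = math.atan2(spacing_mm, distance_mm)
    return wavelength_A / (2.0 * math.sin(two_theta / 2.0))


def cell_edge_small_angle(spacing_mm, distance_mm, wavelength_A):
    """Small-angle approximation: edge = wavelength * distance / spacing."""
    return wavelength_A * distance_mm / spacing_mm


if __name__ == "__main__":
    # Hypothetical measurement: 1.0 mm spacing, 100 mm distance, 1.5418 A X-rays.
    exact = cell_edge_from_spot_spacing(1.0, 100.0, 1.5418)
    approx = cell_edge_small_angle(1.0, 100.0, 1.5418)
    print(round(exact, 2), round(approx, 2))
    # Doubling the spot spacing halves the deduced cell edge.
    print(round(cell_edge_small_angle(2.0, 100.0, 1.5418), 2))
```

The small-angle form makes the inverse proportionality between reflection spacing and cell edge explicit; the two functions agree closely when the spacing is small compared with the crystal-to-detector distance.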

Once the dimensions of the unit cell are determined, the likely number of
polypeptides in the asymmetric unit can be deduced from the size of the polypeptide,
the density of the average protein, and the typical solvent content of a protein
crystal, which is usually in the range of 30-70% of the unit cell volume.
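
By way of illustration only, the estimate described above can be sketched with the commonly used Matthews coefficient, V_M = V_cell / (Z * n * MW), and the approximation that the solvent fraction is 1 - 1.23 / V_M, where the constant 1.23 follows from an assumed average protein density. This is a minimal sketch, not the procedure used for the Argonaute crystals; the function names, the cell dimensions, the number of symmetry operators and the molecular weight in the example are all hypothetical.

```python
import math

PROTEIN_FACTOR = 1.23  # A^3/Da; assumes an average protein density


def unit_cell_volume(a, b, c, alpha_deg=90.0, beta_deg=90.0, gamma_deg=90.0):
    """Volume (A^3) of a general unit cell from its edges and angles."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha_deg, beta_deg, gamma_deg))
    return a * b * c * math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)


def matthews_coefficient(cell_volume, n_symmetry_ops, n_copies, mol_weight_da):
    """V_M in A^3/Da for n_copies polypeptides per asymmetric unit."""
    return cell_volume / (n_symmetry_ops * n_copies * mol_weight_da)


def solvent_fraction(v_m):
    """Approximate fraction of the crystal volume occupied by solvent."""
    return 1.0 - PROTEIN_FACTOR / v_m


def plausible_copy_numbers(cell_volume, n_symmetry_ops, mol_weight_da,
                           low=0.30, high=0.70, max_copies=12):
    """Copy numbers whose implied solvent content falls in the typical range."""
    result = []
    for n in range(1, max_copies + 1):
        v_m = matthews_coefficient(cell_volume, n_symmetry_ops, n, mol_weight_da)
        if low <= solvent_fraction(v_m) <= high:
            result.append(n)
    return result


if __name__ == "__main__":
    volume = unit_cell_volume(100.0, 120.0, 150.0)  # hypothetical cell
    print(plausible_copy_numbers(volume, n_symmetry_ops=4, mol_weight_da=90000.0))
```

More than one copy number can fall within the 30-70% window, in which case the estimate narrows the possibilities without selecting a single answer.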

The diffraction pattern is related to the three-dimensional shape of the
molecule by a Fourier transform. The process of determining the solution is in
essence a re-focusing of the diffracted X-rays to produce a three-dimensional image
of the molecule in the crystal. Since re-focusing of X-rays cannot be done with a
lens at this time, it is done via mathematical operations.

The sphere of diffraction has symmetry that depends on the internal
symmetry of the crystal, which means that certain orientations of the crystal will
produce the same set of reflections. Thus, a crystal with high symmetry has a more
repetitive diffraction pattern, and there are fewer unique reflections that need to be
recorded in order to have a complete representation of the diffraction. The goal of
data collection, a dataset, is a set of consistently measured, indexed intensities for as
many reflections as possible. A complete dataset is collected if at least 80%,
preferably at least 90%, most preferably at least 95% of unique reflections are
recorded. In one embodiment, a complete dataset is collected using one crystal. In
another embodiment, a complete dataset is collected using more than one crystal of
the same type.
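
By way of illustration only, the completeness criterion described above can be sketched as the fraction of expected unique reflections that were actually recorded. This is a minimal sketch that assumes the reflections have already been indexed and reduced to the symmetry-unique set by other software; the function names and the toy reflection lists are hypothetical.

```python
def completeness(observed_hkls, expected_unique_hkls):
    """Fraction of expected unique reflections present among the observed ones."""
    expected = set(expected_unique_hkls)
    if not expected:
        raise ValueError("the expected reflection list is empty")
    recorded = set(observed_hkls) & expected
    return len(recorded) / len(expected)


def completeness_level(fraction):
    """Classify a dataset against the 80%, 90% and 95% levels recited above."""
    if fraction >= 0.95:
        return "at least 95%"
    if fraction >= 0.90:
        return "at least 90%"
    if fraction >= 0.80:
        return "at least 80%"
    return "incomplete"


if __name__ == "__main__":
    # Toy example: 20 expected unique reflections, 18 of them recorded.
    expected = [(h, 0, 0) for h in range(1, 21)]
    observed = [(h, 0, 0) for h in range(1, 19)]
    frac = completeness(observed, expected)
    print(frac, completeness_level(frac))
```

Reflections merged from more than one crystal of the same type can simply be pooled into the observed list before the fraction is computed.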

Sources of X-rays include, but are not limited to, a rotating anode X-ray
generator such as a Rigaku RU-200 or a beamline at a synchrotron light source, such
as the Advanced Photon Source at Argonne National Laboratory. Suitable detectors
for recording diffraction patterns include, but are not limited to, X-ray sensitive film,
multiwire area detectors, image plates coated with phosphor, and CCD cameras.
Typically, the detector and the X-ray beam remain stationary, so that, in order to
record diffraction from different parts of the crystal's sphere of diffraction, the
crystal itself is moved via an automated system of moveable circles called a
goniostat. The three-dimensional (x, y, z) coordinates of Argonaute are shown in
Table 3 (Figure 25) in the standard Protein Data Bank (PDB) format (Bernstein, F.
C., et al. J. Mol. Biol., 1977, 122, 535).
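
By way of illustration only, reading the (x, y, z) coordinates from a PDB-format ATOM record can be sketched as follows. This is a minimal sketch that relies on the fixed-column layout of ATOM records in the PDB format (serial number in columns 7-11, atom name in 13-16, residue name in 18-20, chain in 22, residue number in 23-26, coordinates in 31-54, occupancy in 55-60 and temperature factor in 61-66). The function names are hypothetical, and the sample record is invented for the example and is not taken from Table 3.

```python
def format_atom_record(serial, name, res_name, chain_id, res_seq,
                       x, y, z, occupancy=1.0, temp_factor=0.0):
    """Build a fixed-column ATOM line (columns 1-66 only)."""
    return (
        f"ATOM  {serial:>5d} {name:<4s} {res_name:>3s} {chain_id:1s}{res_seq:>4d}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}{occupancy:6.2f}{temp_factor:6.2f}"
    )


def parse_atom_record(line):
    """Extract selected fields from an ATOM or HETATM line by column position."""
    if not line.startswith(("ATOM", "HETATM")):
        raise ValueError("not an ATOM/HETATM record")
    return {
        "serial": int(line[6:11]),
        "name": line[12:16].strip(),
        "res_name": line[17:20].strip(),
        "chain_id": line[21],
        "res_seq": int(line[22:26]),
        "x": float(line[30:38]),
        "y": float(line[38:46]),
        "z": float(line[46:54]),
        "occupancy": float(line[54:60]),
        "temp_factor": float(line[60:66]),
    }


if __name__ == "__main__":
    # Invented example atom: backbone nitrogen of a methionine residue.
    line = format_atom_record(1, " N", "MET", "A", 1, 27.340, 24.430, 2.614, 1.00, 9.67)
    print(line)
    print(parse_atom_record(line))
```

Because the fields are positional rather than delimiter-separated, slicing by column is more reliable than splitting on whitespace, which fails when adjacent fields have no space between them.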

TABLE 3 - Atomic Coordinates (Figure 25).

Once a dataset such as the one in Table 3 (Figure 25) is collected, the
information is used to determine the three-dimensional structure of the molecule in
the crystal. However, in the absence of a suitable molecular model, this
cannot be done from a single measurement of reflection intensities alone, because
certain information, known as phase information, is lost between the three-dimensional
shape of the molecule and its Fourier transform, the diffraction pattern. This phase
information must be acquired by methods described below in order to perform a
Fourier transform on the diffraction pattern to obtain the three-dimensional structure
of the molecule in the crystal. It is the determination of phase information that in
effect refocuses X-rays to produce the image of the molecule.
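
By way of illustration only, the loss of phase information can be sketched with a one-dimensional toy "crystal". This is a minimal sketch that assumes NumPy is available and uses a discrete Fourier transform of a made-up density in place of real diffraction data; the function names and the density values are hypothetical. Amplitudes correspond to the square roots of the measured intensities.

```python
import numpy as np


def structure_factors(density):
    """Complex structure factors of a 1D periodic density (discrete Fourier transform)."""
    return np.fft.fft(density)


def density_from(amplitudes, phases):
    """Fourier synthesis: rebuild a density from amplitudes and phases."""
    return np.fft.ifft(amplitudes * np.exp(1j * phases)).real


# Made-up 1D "unit cell" with four point-like atoms of different weights.
density = np.zeros(32)
density[[3, 10, 11, 25]] = [6.0, 8.0, 7.0, 16.0]

F = structure_factors(density)
amplitudes = np.abs(F)   # what a diffraction experiment measures (as intensities)
phases = np.angle(F)     # what the experiment does not record

with_phases = density_from(amplitudes, phases)
without_phases = density_from(amplitudes, np.zeros_like(phases))

if __name__ == "__main__":
    print("recovered with phases:   ", np.allclose(with_phases, density))
    print("recovered without phases:", np.allclose(without_phases, density))
```

With the correct phases the synthesis reproduces the original density, whereas the same amplitudes combined with arbitrary (here zero) phases yield a different function, which is why the phasing methods described below are needed.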

One method of obtaining phase information is by isomorphous replacement,
in which heavy-atom derivative crystals are used. In this method, the positions of
heavy atoms bound to the molecules in the heavy-atom derivative crystal are
determined, and this information is then used to obtain the phase information
necessary to elucidate the three-dimensional structure of a native crystal (Blundell et
al., 1976, Protein Crystallography, Academic Press).

Another method of obtaining phase information is by molecular replacement,
which is a method of calculating initial phases for a new crystal of a polypeptide or
polypeptide co-complex whose structure coordinates are unknown by orienting and
positioning a related polypeptide whose structure coordinates are known within the
unit cell of the new crystal so as to best account for the observed diffraction pattern
of the new crystal. To enable this, the related molecule must have a similar
three-dimensional structure. Briefly, the principle behind the method of molecular
replacement is as follows. A suitable search model, whose three-dimensional
structure is similar to that of the unknown target, is identified first. The search
model is then rotated and translated within the unit cell of the unknown. For each
position of the model, a set of structure factors of the model is computed. These
calculated structure factors are then compared with the measured intensities of the
unknown and expressed as correlation coefficients. The solution with the highest
correlation coefficient is selected as the true solution. These concepts are discussed
at length in the book "The Molecular Replacement Method" edited by Rossmann
(1972, Int. Sci. Rev. Ser. No 13, Gordon & Breach, New York).
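
By way of illustration only, the search-and-score principle described above can be sketched with a two-dimensional toy problem. This is a deliberately simplified sketch, not a molecular replacement program: it assumes NumPy is available, treats atoms as unit point scatterers, samples the molecular transform at a hypothetical set of reciprocal-space vectors, ignores crystal symmetry, and performs only the rotation search (the translation search is omitted). Because intensities are unchanged by inversion, rotations that differ by 180 degrees are indistinguishable in this 2D setting, so only 0-175 degrees is searched. All names and coordinates are hypothetical.

```python
import numpy as np


def rotate(coords, angle_deg):
    """Rotate 2D coordinates (one atom per row) counter-clockwise by angle_deg."""
    t = np.radians(angle_deg)
    rot = np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])
    return coords @ rot.T


def intensities(coords, s_vectors):
    """|F(s)|^2 for unit point atoms, with F(s) = sum_j exp(2*pi*i * s . r_j)."""
    phase = 2.0 * np.pi * (s_vectors @ coords.T)
    return np.abs(np.exp(1j * phase).sum(axis=1)) ** 2


def rotation_search(model, observed, s_vectors, angles_deg):
    """Return the trial angle with the highest correlation coefficient, and that score."""
    scores = [
        np.corrcoef(intensities(rotate(model, a), s_vectors), observed)[0, 1]
        for a in angles_deg
    ]
    best = int(np.argmax(scores))
    return angles_deg[best], float(scores[best])


# Hypothetical search model and a "measured" pattern from the same model rotated by 40 degrees.
model = np.array([[0.0, 0.0], [1.5, 0.2], [2.1, 1.7], [0.4, 2.9], [-1.2, 1.1]])
h, k = np.meshgrid(np.arange(-4, 5), np.arange(-4, 5))
s_vectors = np.column_stack([h.ravel(), k.ravel()]) / 10.0
observed = intensities(rotate(model, 40.0), s_vectors)

best_angle, best_cc = rotation_search(model, observed, s_vectors, list(range(0, 180, 5)))

if __name__ == "__main__":
    print(best_angle, round(best_cc, 3))
```

In practice the search model is only similar to the target, so the best correlation coefficient is well below 1, and the orientation found is followed by a translation search and refinement.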

A third method of phase determination is multi-wavelength anomalous
dispersion or MAD. In this method, X-ray diffraction data are collected at several
different wavelengths from a single crystal containing at least one heavy atom with
absorption edges near the energy of incoming X-ray radiation. The resonance
between X-rays and electron orbitals leads to differences in X-ray scattering that
permit the locations of the heavy atoms to be identified, which in turn provides
phase information for a crystal of a polypeptide. A detailed discussion of MAD
analysis can be found in Hendrickson, 1985, Trans. Am. Crystallogr. Assoc., 21:11;
Hendrickson et al., 1990, EMBO J. 9:1665; and Hendrickson, 1991, Science 4:91.

A fourth method of determining phase information is single wavelength
anomalous dispersion or SAD. In this technique, X-ray diffraction data are
collected at a single wavelength from a single native or heavy-atom derivative
crystal, and phase information is extracted using anomalous scattering information
from atoms such as sulfur or chlorine in the native crystal or from the heavy atoms
in the heavy-atom derivative crystal. A detailed discussion of SAD analysis can be
found in Brodersen et al., 2000, Acta Cryst. D56:431-441.

A fifth method of determining phase information is single isomorphous
replacement with anomalous scattering or SIRAS. This technique combines
isomorphous replacement and anomalous scattering techniques to provide phase
information for a crystal of a polypeptide. X-ray diffraction data are collected at a
single wavelength, usually from a single heavy-atom derivative crystal. Phase
information obtained only from the location of the heavy atoms in a single
heavy-atom derivative crystal leads to an ambiguity in the phase angle, which is resolved
using anomalous scattering from the heavy atoms. Phase information is therefore
extracted from both the location of the heavy atoms and from anomalous scattering
of the heavy atoms. A detailed discussion of SIRAS analysis can be found in North,
1965, Acta Cryst. 18:212-216; Matthews, 1966, Acta Cryst. 20:82-86.

Once phase information is obtained, it is combined with the diffraction data
to produce an electron density map, an image of the electron clouds that surround
the molecules in the unit cell. The higher the resolution of the data, the more
distinguishable are the features of the electron density map, e.g., amino acid side
chains and the positions of carbonyl oxygen atoms in the peptide backbones,
because atoms that are closer together are resolvable. A model of the
macromolecule is then built into the electron density map with the aid of a computer,
using as a guide all available information, such as the polypeptide sequence and the
established rules of molecular structure and stereochemistry. Interpreting the
electron density map is a process of finding the chemically realistic conformation
that fits the map precisely.

After a model is generated, the structure is refined. Refinement is the process
of minimizing the function Φ, which is the difference between observed and
calculated intensity values (measured by an R-factor), and which is a function of the
position, temperature factor, and occupancy of each non-hydrogen atom in the
model. This usually involves alternate cycles of real space refinement, i.e.,
calculation of electron density maps and model building, and reciprocal space
refinement, i.e., computational attempts to improve the agreement between the
original intensity data and intensity data generated from each successive model.
Refinement ends when the function Φ converges on a minimum wherein the model
fits the electron density map and is stereochemically and conformationally
reasonable. During refinement, ordered solvent molecules are added to the
structure.
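
By way of illustration only, the agreement measure referred to above can be sketched using the conventional crystallographic R-factor, R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|). This is a minimal sketch that assumes the observed and calculated structure-factor amplitudes are already on a common scale and listed in the same reflection order; the function name and the amplitude values are hypothetical and are not data from the Argonaute structure.

```python
def r_factor(f_obs, f_calc):
    """Conventional R-factor between observed and calculated amplitudes."""
    if len(f_obs) != len(f_calc):
        raise ValueError("amplitude lists must have the same length")
    denominator = sum(abs(o) for o in f_obs)
    if denominator == 0:
        raise ValueError("observed amplitudes sum to zero")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    return numerator / denominator


if __name__ == "__main__":
    f_obs = [100.0, 50.0, 25.0, 25.0]            # made-up observed amplitudes
    f_calc_initial = [90.0, 55.0, 30.0, 20.0]    # amplitudes from an initial model
    f_calc_refined = [98.0, 51.0, 26.0, 24.0]    # amplitudes after further refinement
    print(r_factor(f_obs, f_calc_initial))
    print(r_factor(f_obs, f_calc_refined))
```

A falling R-factor over successive refinement cycles indicates improving agreement between the model and the data, although it does not by itself establish that the model is stereochemically reasonable.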

d. Various Representations

The atomic structure coordinates and machine readable media of the
application have a variety of uses. The present invention encompasses the structure
coordinates and other information, e.g., amino acid sequence, connectivity tables,
vector-based representations, temperature factors, etc., used to generate the
three-dimensional structures of the polypeptides for use in the software programs
described below and other software programs. For example, the coordinates listed
in Table 3 (Figure 25) are useful for solving the three-dimensional crystal or solution
structures of other proteins to high resolution.

Additionally, the invention encompasses machine readable media embedded
with the three-dimensional structures of the models described herein, or with
portions thereof. As used herein, "machine readable medium" or "computer
readable medium" refers to any medium that can be read and accessed directly by a
computer or scanner. Such media include, but are not limited to: magnetic storage
media, such as floppy discs, hard disc storage medium and magnetic tape; optical
storage media such as optical discs or CD-ROM; electrical storage media such as
RAM or ROM; and hybrids of these categories such as magnetic/optical storage
media. Such media further include paper on which is recorded a representation of
the atomic structure coordinates, e.g., Cartesian coordinates, that can be read by a
scanning device and converted into a three-dimensional structure using Optical
Character Recognition (OCR).

A variety of data storage structures are available to a skilled artisan for
creating a computer readable medium having recorded thereon the atomic structure
coordinates of the application or portions thereof and/or X-ray diffraction data. The
choice of the data storage structure will generally be based on the means chosen to
access the stored information. In addition, a variety of data processor programs and
formats can be used to store the sequence and X-ray data information on a computer
readable medium. Such formats include, but are not limited to, Protein Data Bank
("PDB") format (Research Collaboratory for Structural Bioinformatics;
http://www.rcsb.org/pdb/docs/format/pdbguide2.2/guide2.2_frame.html);
Cambridge Crystallographic Data Centre format
(http://www.ccdc.cam.ac.uk/support/csd_doc/volume3/z323.html);
Structure-data ("SD") file format (MDL Information
Systems, Inc.; Dalby et al., 1992, J. Chem. Inf. Comp. Sci. 32:244-255), and
line-notation, e.g., as used in SMILES (Weininger, 1988, J. Chem. Inf. Comp. Sci.
28:31-36). Methods of converting between various formats read by different
computer software will be readily apparent to those of skill in the art, e.g., BABEL
(v. 1.06, Walters & Stahl, © 1992, 1993, 1994;
http://www.brunel.ac.uk/departments/chem/babel.htm). All format representations of
the polypeptide coordinates described herein, or portions thereof, are contemplated
by the present invention. By providing a computer readable medium having stored
thereon the atomic coordinates of the application, one of skill in the art can routinely
access the atomic coordinates of the application, or portions thereof, and related
30 information for use in modeling and design programs, described in detail below. 
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While Cartesian coordinates are important and convenient representations of 
the three-dimensional structure of a polypeptide, those of skill in the art will readily 
recognize that other representations of the structure are also useful. Therefore, the 
three-dimensional structure of a polypeptide, as discussed herein, includes not only 
5 the Cartesian coordinate representation, but also all alternative representations of the 
three-dimensional distribution of atoms. For example, atomic coordinates may be 
represented as a Z-matrix, wherein a first atom of the protein is chosen, a second 
atom is placed at a defined distance from the first atom, a third atom is placed at a 
defined distance from the second atom so that it makes a defined angle with the first
atom. Each subsequent atom is placed at a defined distance from a previously
placed atom with a specified angle with respect to the third atom, and at a specified
torsion angle with respect to a fourth atom. Atomic coordinates may also be 
represented as a Patterson function, wherein all interatomic vectors are drawn and 
are then placed with their tails at the origin. This representation is particularly
useful for locating heavy atoms in a unit cell. In addition, atomic coordinates may
be represented as a series of vectors having magnitude and direction and drawn from 
a chosen origin to each atom in the polypeptide structure. Furthermore, the positions 
of atoms in a three-dimensional structure may be represented as fractions of the unit 
cell (fractional coordinates), or in spherical polar coordinates. 
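
As a minimal sketch of how two of the alternative representations named above relate to Cartesian coordinates, the following Python functions convert fractional coordinates to Cartesian coordinates and Cartesian coordinates to spherical polar coordinates. The function names, the argument layout and the axis convention (a along x, b in the xy-plane) are assumptions made for illustration; they are not taken from the application.

```python
import math


def fractional_to_cartesian(frac, cell):
    """Convert fractional coordinates (u, v, w) to Cartesian (x, y, z).

    cell = (a, b, c, alpha, beta, gamma): edge lengths in Angstroms and
    angles in degrees. Assumed convention: a lies along x, b lies in the
    xy-plane.
    """
    a, b, c, alpha, beta, gamma = cell
    ca, cb, cg = (math.cos(math.radians(t)) for t in (alpha, beta, gamma))
    sg = math.sin(math.radians(gamma))
    # Unit-cell volume divided by a*b*c.
    vol = math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)
    u, v, w = frac
    x = a * u + b * cg * v + c * cb * w
    y = b * sg * v + c * (ca - cb * cg) / sg * w
    z = c * vol / sg * w
    return (x, y, z)


def cartesian_to_spherical(xyz):
    """Return (r, theta, phi): radius, polar angle from +z, azimuth from +x."""
    x, y, z = xyz
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        return (0.0, 0.0, 0.0)
    # Clamp to guard against rounding slightly outside [-1, 1].
    theta = math.acos(max(-1.0, min(1.0, z / r)))
    phi = math.atan2(y, x)
    return (r, theta, phi)
```

For an orthorhombic cell (all angles 90 degrees) the conversion reduces to scaling each fractional coordinate by the corresponding cell edge.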

Additional information, such as thermal parameters, which measure the
motion of each atom in the structure, chain identifiers, which identify the particular
chain of a multi-chain protein or protein co-complex in which an atom is located, 
and connectivity information, which indicates to which atoms a particular atom is 
bonded, is also useful for representing a three-dimensional molecular structure. 
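
By way of illustration only, the sketch below reads the Cartesian coordinates, chain identifier and thermal parameter (B-factor) from ATOM records stored in PDB format, using the fixed column positions of that format. The `Atom` type and the `parse_atom_records` function are hypothetical names introduced here, and HETATM and other record types are simply skipped.

```python
from typing import List, NamedTuple


class Atom(NamedTuple):
    name: str        # atom name, e.g. "CA"
    res_name: str    # residue name, e.g. "ASP"
    chain_id: str    # chain identifier
    res_seq: int     # residue sequence number
    x: float
    y: float
    z: float
    b_factor: float  # thermal parameter; 0.0 if the field is blank


def parse_atom_records(pdb_text: str) -> List[Atom]:
    """Extract ATOM records from PDB-format text (fixed-width columns)."""
    atoms = []
    for line in pdb_text.splitlines():
        if line[:6] != "ATOM  ":
            continue
        line = line.ljust(66)  # pad short lines so slicing is safe
        b_field = line[60:66].strip()
        atoms.append(
            Atom(
                name=line[12:16].strip(),
                res_name=line[17:20].strip(),
                chain_id=line[21],
                res_seq=int(line[22:26]),
                x=float(line[30:38]),
                y=float(line[38:46]),
                z=float(line[46:54]),
                b_factor=float(b_field) if b_field else 0.0,
            )
        )
    return atoms
```

The resulting list can be filtered by atom name or chain before being passed to modeling or comparison routines.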

e. Structure of Argonaute

The present invention provides high-resolution three-dimensional structures 
and atomic structure coordinates of crystalline Argonaute as determined by X-ray 
crystallography. The specific methods used to obtain the structure coordinates are 
provided in the examples and throughout the application. The atomic structure 
coordinates of crystalline Argonaute are listed in Table 3 (Figure 25).

Those having skill in the art will recognize that atomic structure coordinates
as determined by X-ray crystallography are not without error. Thus, it is to be 
understood that any set of structure coordinates obtained for crystals of Argonaute,
whether native crystals, derivative crystals or co-crystals, that has a root mean
square deviation ("r.m.s.d.") of less than or equal to about 1.5 Angstrom when
superimposed, using backbone atoms (N, Cα, C and O), on the structure coordinates
listed in Table 3 (Figure 25) is considered to be identical with the structure
coordinates listed in Table 3 (Figure 25) when at least about 50% to 100% of the
backbone atoms of Argonaute are included in the superposition.
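
The comparison described above can be sketched as follows. This is a minimal illustration that assumes the two coordinate sets have already been optimally superimposed by some other routine (for example, a least-squares fitting procedure); it only computes the backbone r.m.s.d. and the fraction of reference backbone atoms included. The dictionary layout, the function names and the default thresholds of 1.5 Angstrom and 50% are illustrative choices drawn from the "about" values in the text, not a definitive implementation.

```python
import math

BACKBONE = ("N", "CA", "C", "O")


def backbone_rmsd(reference, candidate):
    """Return (rmsd, coverage) for two pre-superimposed structures.

    Each structure is a dict mapping (residue_number, atom_name) to an
    (x, y, z) tuple in Angstroms. Coverage is the fraction of reference
    backbone atoms that also occur in the candidate.
    """
    ref_keys = [key for key in reference if key[1] in BACKBONE]
    shared = [key for key in ref_keys if key in candidate]
    if not shared:
        raise ValueError("no backbone atoms in common")
    squared = sum(
        sum((p - q) ** 2 for p, q in zip(reference[key], candidate[key]))
        for key in shared
    )
    return math.sqrt(squared / len(shared)), len(shared) / len(ref_keys)


def is_equivalent(reference, candidate, max_rmsd=1.5, min_coverage=0.5):
    """Apply the r.m.s.d. and coverage thresholds to a candidate structure."""
    rmsd, coverage = backbone_rmsd(reference, candidate)
    return rmsd <= max_rmsd and coverage >= min_coverage
```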

II. Crystalline Argonaute

It is to be understood that the crystalline Argonaute of the application is not
limited to naturally occurring or native Argonaute. Indeed, the crystals of the 
application include crystals of mutants of native Argonaute. Mutants of naturally- 
occurring or native Argonautes are obtained by replacing at least one amino acid 
residue in a native Argonaute with a different amino acid residue, or by adding or
deleting amino acid residues within the native polypeptide or at the N- or C-terminus 
of the native polypeptide, and have substantially the same three-dimensional 
structure as the native Argonaute from which the mutant is derived. 

By having substantially the same three-dimensional structure is meant having 
a set of atomic structure coordinates that have a root-mean-square deviation of less
than or equal to about 2 Angstrom when superimposed with the atomic structure
coordinates of the native Argonaute from which the mutant is derived when at least
about 50% to 100% of the Cα atoms of the native Argonaute domain are included in
the superposition. 

Amino acid substitutions, deletions and additions which do not significantly
interfere with the three-dimensional structure of the Argonaute will depend, in part,
on the region of the Argonaute where the substitution, addition or deletion occurs. 
In highly variable regions of the molecule, non-conservative substitutions as well as 
conservative substitutions may be tolerated without significantly disrupting the
three-dimensional structure of the molecule. In highly conserved regions, or
regions containing significant secondary structure, conservative amino acid
substitutions are preferred. 

Conservative amino acid substitutions are well known in the art, and include 
substitutions made on the basis of similarity in polarity, charge, solubility, 
hydrophobicity, hydrophilicity and/or the amphipathic nature of the amino acid
residues involved. For example, negatively charged amino acids include aspartic 
acid and glutamic acid; positively charged amino acids include lysine and arginine; 
amino acids with uncharged polar head groups having similar hydrophilicity values 
include the following: leucine, isoleucine, valine; glycine, alanine; asparagine, 
glutamine; serine, threonine; phenylalanine, tyrosine. Other conservative amino acid
substitutions are well known in the art. 

For Argonaute obtained in whole or in part by chemical synthesis, the 
selection of amino acids available for substitution or addition is not limited to the 
genetically encoded amino acids. Indeed, the mutants described herein may contain 
non-genetically encoded amino acids. Conservative amino acid substitutions for
many of the commonly known non-genetically encoded amino acids are well known
in the art. Conservative substitutions for other amino acids can be determined based 
on their physical properties as compared to the properties of the genetically encoded 
amino acids. 

In some instances, it may be particularly advantageous or convenient to
substitute, delete and/or add amino acid residues to a native Argonaute in order to
provide convenient cloning sites in cDNA encoding the polypeptide, to aid in 
purification of the polypeptide, and for crystallization of the polypeptide. Such 
substitutions, deletions and/or additions which do not substantially alter the
three-dimensional structure of the native Argonaute domain will be apparent to those of
ordinary skill in the art. 

It should be noted that the mutants contemplated herein need not all exhibit 
Argonaute activity. Indeed, amino acid substitutions, additions or deletions that 
interfere with the Argonaute activity but which do not significantly alter the
three-dimensional structure of the domain are specifically contemplated by the invention.

WO 2006/015258 



PCT/US2005/027084 



Such crystalline polypeptides, or the atomic structure coordinates obtained 
therefrom, can be used to identify compounds that bind to the native domain. These 
compounds can affect the activity of the native domain. 

The co-crystals of the application generally comprise a crystalline Argonaute 
domain polypeptide in association with one or more compounds. The association
may be covalent or non-covalent. Such compounds include, but are not limited to, 
cofactors, substrates, substrate analogues, modulators, allosteric effectors, etc. 

Argonaute 

As used herein, the term "Argonaute" refers to a protein which (a) mediates
an RNAi response and (b) has an amino acid sequence at least 50 percent identical,
and more preferably at least 75, 85, 90 or 95 percent identical to SEQ ID NOs.: 1-5.
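
One simple way to compute the percent identity used in this definition is sketched below. It assumes the two sequences have already been aligned by an external tool and are supplied as equal-length strings with "-" marking gaps; the function name is hypothetical, and counting identity over aligned, non-gap columns is only one of several conventions in use.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared
```

A candidate protein would satisfy part (b) of the definition under this convention if the returned value is at least 50 for the chosen reference sequence.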

Mammals contain four Argonaute 1 subfamily members, Ago1-Ago4
(nomenclature as in Carmell et al., Genes Dev. 16, 2733 (2002); see Fig. 22, which
provides a sequence alignment of human Ago1-4 proteins, corresponding to SEQ ID
NOs: 1-4). Different Argonaute family members in Drosophila preferentially
associate with different small RNAs, with Ago1 preferring miRNAs and Ago2
siRNAs (24). Recent studies of dmAgo1 and dmAgo2 mutants have strengthened
these conclusions (25). To assess whether mammalian Ago proteins are specialized in
their interactions with small RNAs, Ago-associated miRNA populations were
examined by microarray analysis (Example 1).

Amino Acid Sequence of Pyrococcus furiosus Argonaute Protein: 
SEQ ID NO.: 5 

MKAKVVINLVKiNKiaiPDKIYVYRLFNDPEEELQKEGYSIYRLAYEN 
VGIVIDPENLIIATTKELEYEGEFIPEGEISFSELRNDYQSKLVLRLLKENGIGE 
YELSKLLRKFRKPKTFGDYKVIPSVEMSVIKHDEDFYLVIHIIHQIQSMKTLW
ELVNKDPKELEEFLMTHKENLMLKDIASPLKTVYKPCFEEYTKKPKLDHNQ 
EIVKYWYNYHIERYWNTPEAKLEFYRKFGQVDLKQPAILAKFASKIKKNKN 
YKIYLLPQLWPTYNAEQLESDVAKEILEYTKLMPEERKELLENILAEVDSDI
IDKSLSEIEVEKIAQELENKIRVRDDKGNSVPISQLNVQKSQLLLWTNYSRKY
PVILPYEVPEKFRKIREIPMFIILDSGLLADIQNFATNEFRELVKSMYYSLAKK 
YNSLAK1CARSTNEIGLPFLDFRGKEKVITEDLNSDKGIIEVVEQVSSFMKGKE 
LGLAFDVARNKLSSEKFEEIKPJiLFNLNVISQVVNEDTTXNKRDKYDRNRLD 
LFVRHNLLFQVLSKLGVKYYVLDYRFNYDYnGIDVAPMKRSEGYIGGSAV
MFDSQGYIRKIWIKIGEQRGESVDMNEFFKEMVDKFKEFNIKLDNKKILLLR 
DGRITNNEEEGLKYISEMFDIEVVTMDVIKNHPVRAFANMKMYFNLGGAIY 
LIPHKLKQAKGTPIPIKLAKKRIIKNGKVEKQSITRQDVLDIFILTRLNYGSISA 
DMRLPAPVHYAHKFANAIRNEWKIKEEFLAEGFLYFV 

1. Overall Architecture

This application provides the structure of the full-length Argonaute from the 
archaebacterium Pyrococcus furiosus (PfAgo) as determined by x-ray 
crystallography to 2.25 Å resolution. The structure was solved by multiple
anomalous dispersion (MAD) and isomorphous replacement using selenium and
mercury derivatives (Table 2, shown in Figure 24). The N-terminal, middle, and
PIWI domains form a crescent-shaped base, with the PIWI domain at the center of 
the crescent. The region following the N-terminus forms a "stalk" that holds the 
PAZ domain above the crescent and an interdomain connector cradles the molecule 
(Fig. 1). This architecture results in a cleft formed at the center of the crescent with
the PAZ domain closing in on this cleft.

The N-terminal domain consists of a long strand at the bottom of the 
crescent, continuing to a region of a small four-stranded β-sheet, three α-helices and
a β-hairpin, which then extends to the three-stranded antiparallel β-sheet stalk.

Also provided is the PAZ domain, a globular domain that adopts an OB-like 
β-barrel fold with an attachment on one side of the barrel and a cleft in between.
This cleft was shown to be the binding site for the 2-nucleotide 3'-overhang of the
siRNA (29, 32, 33) and is angled towards the crescent. The PAZ domain in PfAgo
superimposes very well with the PAZ domains from Drosophila Argonaute 1 (30)
and 2 (29, 31) and with the human Argonaute-1 (hAgo1) PAZ domain in complex
with a "mini-siRNA" (33), though the attachment in the archaeal protein has two
α-helices rather than an α-helix and a β-hairpin (Figs. 2A and 2B).

The middle domain, which is located at one end of the crescent, is an α/β
open sheet domain composed of a central three-stranded parallel β-sheet surrounded
by α-helices. This domain is similar to the glucose-galactose-arabinose-ribose
binding protein family and is most similar to Lac repressor (35). The middle domain
also has a small three-stranded β-sheet on the outer surface of the crescent, connecting
it to the rest of the molecule. 

Further provided is the PIWI domain, which is at the C-terminus of 
Argonaute (residues 545-770). It sits in the middle of the crescent and below the
PAZ domain. The crystal structure reveals the presence of a prominent central
five-stranded β-sheet flanked on both sides by α-helices at the core of the PIWI domain.
A smaller β-sheet extends from the central β-sheet and attaches PIWI to the
N-terminal domain and to portions of the interdomain connector.

2. Domain Structure

As mentioned above, the PAZ domain superimposes very well with all the 
other PAZ domains with known structures, namely, Drosophila Argonautes 1 and 2 
and hAgo1 (Fig. 2A). Most of the differences lie in loop regions. The
root-mean-square deviation (rmsd) between hAgo1-PAZ and the PAZ domain in this structure
is approximately 1.4 Å (for 53 Cα atoms). Though it is now possible to align the
sequence of the PAZ domain of PfAgo with PAZ domains from Argonaute proteins
of higher eukaryotes (Fig. 2B) based on the structures, homology between the
archaeal and eukaryotic PAZ domains was not apparent before the PfAgo structure
was determined. In fact, primary sequence comparisons provided no evidence that
PfAgo contained a PAZ domain. Even after attempting to align the sequences with
reference to the three-dimensional structures, the sequence identity remains below 
10%. The presence and location of the PIWI domain was, on the other hand, 
obvious from the primary sequence, and could be readily identified through BLAST 
searches.

The role of the PAZ domain, as shown for fly Ago-2 (29, 32) and for hAgo-1
(33) is to bind the 2-nucleotide 3' overhang of the siRNA. Importantly, the 
conserved aromatic residues that fill the cleft and were shown to bind those 
nucleotides (29, 32, 33) are all present in the PfAgo PAZ domain. Curiously, in 
some cases, these side chains occupy similar positions in space even if they aren't
anchored to positions on the peptide backbone corresponding to those in eukaryotic 
proteins. Specifically, Y212, Y216, H217 and Y190 are equivalent to Y309, Y314, 
H269 and Y277 of hAgo1 that were shown to bind the oxygens of the phosphate that
links the two bases in the overhang. Residue Y190 of PfAgo superimposes perfectly
on hAgo1-Y277 that was also shown to bind the 2'-hydroxyl of the penultimate
nucleotide. Residues L263 and I261 can assume the role of L337 and T335, which
anchor the sugar ring of the terminal residue through van der Waals interactions in
the hAgo1-RNA structure. There is an aromatic residue, F292 in hAgo1 that stacks
against the terminal nucleotide. This position is occupied by another aromatic,
W213, in PfAgo. Finally, R220 in the structure of the present application is
positioned similarly to K313 that contacts the penultimate nucleotide. As for
residues that were shown to bind the region of the RNA strand 5' to the overhang,
K191 is positioned as R278 in hAgo1 to bind phosphates and Y259 is equivalent to
K333. Other PAZ residues, such as K252, K248, Q276 and N176 are probably used
to bind that strand as well. Accordingly, the PAZ domain in PfAgo appears to have
a similar function to the PAZ domains of the fly and human Argonautes and would 
also be capable of binding a 3' single-stranded region of an RNA molecule. 

The present application also provides a PIWI domain core having a tertiary 
structure that belongs to the RNase H family of enzymes, which include RNase H
type 1 and type 2 enzymes. This fold is also characteristic of other enzymes with
nuclease or polynucleotidyl transferase activities, such as HIV and ASV integrases
(36, 37), RuvC (38), a Holliday junction endonuclease, and transposases such as Mu
(39) and Tn5 (40). The closest matches, however, are with RNase HII (41) and
RNase HI (42). The rmsd's between these proteins and PfAgo are 1.9 Å and they
are topologically identical (Fig. 3A). RNase H fold proteins all have a five-stranded
mixed β-sheet surrounded by helices. In the RNase H enzymes as well as PIWI,
there are two helices on either side of the β-sheet. On one side these are very
similar, and on the other, one of the helices varies. PIWI has an insertion between
the last strand and the last helix of the RNase H fold. This insertion consists of a
smaller β-sheet attachment and a helix that links it to the rest of the protein. RNase
HII has a cap domain that sits above the active site cleft and forms a groove for
substrate binding (43). In addition, several residues from the cap domain appear to
participate in substrate recognition. The positioning of the cap relative to the RNase 
H fold of the protein is approximately the same as the PAZ domain relative to the 
PIWI domain in Argonaute. 

Similarity is not restricted to the protein fold. In all of these enzymes there
are three highly conserved carboxylates which are essential for catalytic activity
(44). Two of these carboxylate side chains are always located on the first strand, β1,
which is the central strand of the β-sheet, and at the C-terminus of the fourth strand,
β4, of the RNase H fold, which is adjacent to β1 (the red and green strands in Figs.
3A and 3B). The position of the third carboxylate varies between the different
RNase H fold enzymes. Remarkably, when examining a superposition between
either RNase HI or RNase HII and PIWI, two aspartate residues were located at the
same positions as the invariant carboxylates of the RNase H fold (Fig. 3B). These
are D558 located on the first β-strand of PIWI and D628 located at the end of the
fourth strand of the PIWI domain. These aspartates are equivalent to D10 and D70
in E. coli RNase HI, D7 and D112 in Methanococcus jannaschii RNase HII, and D6
and D101 in Archaeoglobus fulgidus RNase HII. The location of the third
carboxylate, a glutamate, in RNase HI and HII is occupied by a valine in Argonaute.
However, a glutamate, E635, is in close proximity to the two aspartates, and this
glutamate may serve as the third active site residue. This residue is positioned on
the second helix of the RNase H fold of PIWI (the blue helix in Figs. 3A-3B). Since
the position of the third carboxylate varies in these proteins, the only requirement
would be for a reasonable spatial position at the active site, a criterion which E635
meets. Therefore, the active site of PfAgo is likely composed of the carboxylate
triad formed by D558, D628 and E635 that make up the "DDE" motif. Interestingly,
an arginine, R627, is also positioned at the center of the active site, as in the case of
the IS4 family of transposases such as Tn5 which appear to have a "DDRE" motif
(40, 45). The active site is thus positioned in a cleft in the middle of the crescent in
the groove below the PAZ domain. 

RNase H enzymes as well as other polynucleotidyl transferase enzymes 
require the presence of divalent metal ions for activity. However, the precise role of 
the metal ions remains unclear. Both one and two metal ion mechanisms have been
proposed. E. coli RNase HI is thought to work via a one-metal ion mechanism in
which Mg2+, coordinated by one carboxylate group, mediates interactions with the
nucleic acid substrate. The other two carboxylates activate a water molecule that
can then attack the scissile phosphate bond (46, 47). The two-metal ion mechanism
was first proposed for the 3' to 5' exonuclease of the Klenow fragment (48, 49). In
this case, one metal interacts with the substrate and stabilizes the reaction
intermediate and the other activates a water molecule and positions it to attack the
scissile phosphate. Indeed, only one metal is observed in the crystal structures of E.
coli RNase HI (42) and A. fulgidus RNase HII (43) while two are seen in the active
site of the isolated HIV RNase H domain of reverse transcriptase (50). Though the
absence of a second metal ion in a crystal structure does not preclude a two-metal
ion mechanism (since the second metal may have weak binding in the absence of
substrates) there are indications that RNase HI does use a single-metal ion
mechanism while HIV RNase H uses two (51). For the PIWI domain of PfAgo, a
strong peak is identified in the Fobs-Fcalc difference electron density map near D558,
and it is assigned as a water molecule at this time. By growing crystals in the 
presence of divalent metal ions, this may be assigned as a metal site unambiguously. 
A divalent metal ion appears to be required for Argonaute activity (52, 53). 

3. siRNA Binding 

The role of Argonaute is presently unknown in archaebacteria. Because of
its similarity to Argonautes in eukaryotes, the siRNA binding characteristics of
PfAgo were examined by using crosslinking and competition assays. A single- 
stranded 21-mer siRNA containing an IodoU nucleotide to facilitate crosslinking 
gave rise to a crosslinked species, whereas a double-stranded siRNA did not (Fig.
4A). In addition, the same labeled ss-siRNA can be readily competed off with an
identical unlabeled oligonucleotide. However, a similar ss-siRNA lacking the
5'-phosphate moiety was unable to compete for crosslinking, even at greater than
ten-fold the concentration at which competition with the 5'-phosphorylated
ss-siRNA was seen (Fig. 4B). Thus, there appears to be a requirement for a bona fide
siRNA for binding. Preferential binding of the ss-siRNA over the ds-version is
consistent with the observation that a ds-siRNA cannot be loaded in vitro to a RISC
complex, though an ss-siRNA can be. Accordingly, the present application provides
a RISC complex comprising an RNAi construct, e.g., an ss-siRNA. The RISC
complex preferably comprises an Argonaute protein, most preferably, an Argonaute
protein with the "Slicer" activity, described in greater detail below.

4. "Slicer" Activity

The finding that the PIWI domain in Argonaute is an RNase H domain
suggests Argonaute as the as-yet unidentified "Slicer" enzyme of RISC, that is,
the enzyme that cleaves the mRNA. RNase H enzymes specialize in single-stranded
cleavage of RNA "guided" by a DNA strand in a double-stranded RNA/DNA
hybrid. In a similar manner, Argonautes may specialize in RNA cleavage, in
particular mRNA, guided by the siRNA strand in a dsRNA substrate. Moreover,
unlike most RNases that leave a 3'-phosphate and 5'-OH, RNase H enzymes
produce products with 3'-OH and 5'-phosphate groups (54). Recently, Martinez and
Tuschl, and Zamore and colleagues showed that cleavage of the mRNA by RISC
produces the latter type of termini (52, 53). A dependence on Mg2+ for activity is
another hallmark of RNase H enzymes, and RISC was also shown to require Mg2+
for cleavage (52). The PAZ domain, shown to recognize and bind the 3'
ends of siRNAs, and the PIWI domain, now shown to be an RNase H domain for
catalytic activity, combine the necessary features of the slicing component of the
RNAi machinery. Therefore, Argonaute, the signature component of RISC, can be
"Slicer" itself.

5. A Model for siRNA-Guided mRNA Cleavage

The placement of the PAZ domain on top of the crescent formed by the
N-terminal, middle and PIWI domains and cradled by the connector region in the
structure of Argonaute defines a distinct groove through the protein. The groove has
a claw shape that bends around between the PAZ and N-terminal domains. A
striking feature of the structure is evident when the electrostatic potential is mapped
on the surface of the protein. As shown in Fig. 5A, the surface of this inner groove
is completely lined with positive charges. These positive charges are of course
suitable for interaction with the negatively charged phosphate backbone and with the
2'-hydroxyl moieties of an RNA molecule, implicating the groove for substrate 
binding. The substrate for Argonaute is a ds-RNA molecule composed of an ss- 
siRNA acting as a guide and the mRNA. 

In order to examine possible substrate binding modes for Argonaute, the
knowledge of siRNA binding to the PAZ domain using the known PAZ-RNA
structure (33) and the mode of binding of RNase H substrates (43, 55-57) were
combined. Since the PAZ domain of PfAgo superimposes so well with the PAZ
domain of hAgo1 in the PAZ-RNA complex as shown above, the two PAZ domains
were superimposed and examined for the resulting position of the RNA with respect
to PfAgo. The strand that interacts with its 3' end in the PAZ cleft was regarded as
the siRNA guide. The second strand would then be regarded as the mRNA substrate
strand (see Fig. 5B). The siRNA guide has its 2 nucleotides at its 3' end inserted
into the PAZ cleft. The nucleotides just 5' to that track the top of the PAZ β-barrel
making very similar, if not identical, interactions with the PAZ domain as in the
crystal structure of the PAZ-RNA complex. A long loop present in the PfAgo PAZ
domain would probably move up slightly to accommodate the siRNA. Upon
examination of the resulting location of the passenger strand, the mRNA would be
coming into the binding groove with its 5' end between the PAZ and the N-terminal
domains. The N-terminus then acts as an "mRNA grip" on that end of the molecule.
It should be noted that there is another extension of the groove that lies between the
N-terminal and the PIWI domains, which could accommodate a single-stranded 
nucleic acid. 

The double-stranded RNA was further extended into the molecule along the
binding groove by model building. Remarkably, the mRNA would be positioned
above the active site located in the PIWI domain 9 nucleotides from the 5'-side end
of the double-stranded region, or rather 11 nucleotides if the 2 nucleotides of the
guide that are inserted into the PAZ domain, which are probably not
interacting with the mRNA, are counted. In other words, the scissile bond would be predicted to
be between nucleotides 11 and 12 from the 5' end of the message or from the 3'-end
of the guide. This precisely coincides with the demonstrated cleavage of mRNAs by
RISC 10 nucleotides from the 5' end of an siRNA. The remainder of the RNA
would then continue along the binding groove (Fig. 5C). The interdomain connector
is also forming part of the back wall of the binding groove. As the RNA molecule
would have to bend somewhat, the details of some of these interactions are not clear.
However, the length of the groove appears to accommodate the length of the siRNA
guide, with the 5' end of the guide probably interacting with the other side of the
groove. From studies of other RNase H enzymes, Argonaute may sense the minor
groove width of the dsRNA, which is different from that of dsDNA and from the
minor groove width of an RNA/DNA hybrid, and which is in accord with the inability
of RISC to cut DNA substrates (53). This mode of recognition would be in addition
to binding the 3' end of the siRNA and sensing the phosphate at the 5' end, as shown
in the binding experiments (Fig. 4). 
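
The position arithmetic in the preceding paragraph can be checked with a short sketch. It assumes a 21-nucleotide guide strand (the length of the siRNA used in the binding experiments) and uses hypothetical function names; it only converts a bond position counted from the 3' end of the guide into the same bond counted from the 5' end.

```python
def bond_from_guide_3prime(paired_offset=9, overhang=2):
    """Nucleotide count from the guide 3' end to the scissile bond.

    paired_offset: nucleotides from the end of the double-stranded region.
    overhang: 3' guide nucleotides inserted into the PAZ cleft.
    """
    return paired_offset + overhang


def scissile_bond_from_5prime(guide_length, k_from_3prime):
    """Convert a bond between nucleotides k and k+1 (counted from the 3' end)
    into the pair of flanking positions counted from the 5' end."""
    if not 1 <= k_from_3prime < guide_length:
        raise ValueError("bond position lies outside the guide strand")
    first = guide_length - k_from_3prime
    return (first, first + 1)
```

With the values stated above, `bond_from_guide_3prime()` gives 11, and `scissile_bond_from_5prime(21, 11)` gives positions 10 and 11 from the 5' end, in line with cleavage 10 nucleotides from the 5' end of the siRNA.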

The groove as observed in the crystal structure presented here, in the absence 
of substrate, would fit an A-RNA double helix snugly. Though a single-stranded 
RNA should bind fairly readily, opening the claw of the molecule somewhat might
assist binding the mRNA, after which it can close down on the double-stranded
substrate. A hinge region may exist in the interdomain connector at residues
317-320. This hinge could lift the PAZ domain away from the crescent base. This is
reasonable since a RISC loading complex appears to be required for assembling an 
active RISC (58, 59). 

The notion that RISC "Slicer" activity, i.e. siRNA-guided mRNA cleavage,
resides in Argonaute itself was tested in a mammalian system where the RNAi
pathway is known to function. It appears that mammalian Argonaute proteins are
distinct and that Ago2 is functional for mRNA cleavage. Based on the sequence
alignment with the archaeal protein, D597, D669 and a third amino acid (e.g., E683)
of hAgo2 correspond to D558, D628 and E635 of PfAgo to form the catalytic triad
"DDE" motif. There is an insertion near E683, and E673 may also act as the third
carboxylate in hAgo2. The conserved active site aspartates were mutated and the
mutants lost their nuclease activity while retaining binding to the siRNA guide. 
Therefore, Argonaute itself functions as the Slicer enzyme in the RNAi pathway. 

In siRNA-guided mRNA cleavage, once RISC is formed, it needs to identify 
its homologous targets, both for target cleavage and for repression at the level of
protein synthesis. In the latter case, there is a presumably stable interaction that 
occurs between the siRNA and its target, with the target being somehow protected 
from cleavage. Certainly, an absence of base pairing in the region of the active site 
might distort the complex sufficiently to prevent catalysis. 

Furthermore, several Argonaute protein family members appear to be
inactive towards mRNA cleavage despite the presence of the catalytic residues. The
basis for these differences may help elucidate the details of the mechanism for 
siRNA-guided mRNA cleavage. The situation here might be somewhat analogous 
to the case of the transposase Tn5 and its inhibitor, which possess a catalytic domain
with a similar RNase H-like fold. Tn5 inhibitor is a truncated version of the active
Tn5 transposase and retains the essential catalytic residues. However, there are 
major conformational differences between the two that result in domains of the 
proteins being in different positions relative to one another (40, 45). Similarly, 
mutations have been introduced into a catalytically active Ago protein, hAgo2, in
the vicinity of the active site, which change residues to corresponding residues in an
inactive Ago, hAgo1. These inactivate Ago2 for cleavage, indicating that there are
determinants for catalysis beyond simply the catalytic triad and that relatively minor 
alterations in the PIWI domain can have profound effects on its activity toward RNA 
substrates. The common fold in the catalytic domain of Argonaute family members
and transposases and integrases is also intriguing given the relationship of RNAi
with control of transposition. It is worth noting that the identification of the catalytic
center of RISC awaited a drive toward understanding RNAi at a structural level. 
Thus, it seems likely that, as in the present example, a full understanding of the 
underlying mechanism of RNAi will derive from a combination of detailed
biochemical and structural studies of RISC.

Assays

The assays and methods described herein may be used in combination or
separately. For example, an in silico screening and an in vitro binding assay and/or
an activity assay may be combined to identify a binding agent and/or a binding agent
for a protein that also modulates activity of the protein.

I. Assays Based on the Atomic Structure Coordinates 

Structural information, often in the form of atomic structure coordinates, 
may also be used in a variety of molecular modeling and computer-based screening 
applications to, for example, design variants that have altered biological properties 
or to computationally design, screen for and/or identify compounds that bind to the
Argonaute protein or to fragments of the Argonaute protein. These compounds may 
modulate the activity of Argonaute protein and hence the RISC activity. 

Thus, in a further aspect of the application, the data from the crystal structure 
of Argonaute is used to evaluate compounds for their utility as modulators of
Argonaute protein. These methods comprise designing and synthesizing candidate
compounds using the atomic coordinates of the three-dimensional structure of such
co-crystals and screening them for their utility in various pharmaceutical applications.

In another embodiment, the structures are probed with a plurality of 
molecules to determine their ability to bind to the Argonaute protein at various sites. 
Such molecules may be able to modulate the activity of Argonaute protein.

In yet another embodiment, the structures can be used to computationally 
screen small molecule databases for chemical entities or compounds that can bind in 
whole, or in part, to Argonaute. In this screening, the quality of fit of such entities 
or compounds to the binding site may be judged either by shape complementarity or 
by estimated interaction energy (Meng et al., 1992, J. Comp. Chem. 13:505-524).
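
As a rough illustration of ranking candidate poses by estimated interaction energy, the sketch below sums a 12-6 Lennard-Jones term over ligand and binding-site atom pairs. It is a simplified stand-in, not the scoring function of any program named herein; the function names and the values of epsilon and r_min are assumptions chosen only for illustration.

```python
import math


def lennard_jones_score(ligand_atoms, site_atoms, epsilon=0.1, r_min=3.8):
    """Sum a 12-6 Lennard-Jones term over all ligand/site atom pairs.

    Atoms are (x, y, z) tuples in angstroms. epsilon (well depth, kcal/mol)
    and r_min (distance of the energy minimum) are illustrative values only.
    """
    total = 0.0
    for ligand_xyz in ligand_atoms:
        for site_xyz in site_atoms:
            r = math.dist(ligand_xyz, site_xyz)
            ratio6 = (r_min / r) ** 6
            # Repulsive r^-12 wall plus attractive r^-6 well;
            # the pair energy is -epsilon when r equals r_min.
            total += epsilon * (ratio6 ** 2 - 2.0 * ratio6)
    return total


def rank_poses(poses, site_atoms):
    """Return candidate poses sorted from most to least favorable score."""
    return sorted(poses, key=lambda pose: lennard_jones_score(pose, site_atoms))
```

A lower (more negative) score indicates a better estimated fit; sterically clashing poses receive large positive scores and sort last.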

The design of compounds that bind to Argonaute according to this invention 
generally involves consideration of two factors. First, the compound must be 
capable of physically and structurally associating with Argonaute. This association
can be covalent or non-covalent. For example, covalent interactions may be
important for designing suicide or irreversible inhibitors of a protein. Non-covalent
molecular interactions important in the association with Argonaute include hydrogen
bonding, ionic and other polar interactions, as well as van der Waals
interactions. Second, the compound must be able to assume a conformation that
allows it to associate with the Argonaute protein. Although certain portions of the 
compound will not directly participate in this association with the protein, those 
portions may still influence the overall conformation of the molecule. This, in turn, 
may have a significant impact on potency. Such conformational requirements 
include the overall three-dimensional structure and orientation of the chemical group
or compound in relation to all or a portion of the binding site, or the spacing between 
functional groups of a compound comprising several chemical groups that directly 
interact with the protein. 

The potential modulatory or binding effect of a chemical compound on 
Argonaute may be analyzed prior to its actual synthesis and testing by the use of
computer modeling techniques. If the theoretical structure of the given compound 
suggests insufficient interaction and association between it and the protein, synthesis 
and testing of the compound is unnecessary. However, if computer modeling 
indicates a strong interaction, the molecule may then be synthesized and tested for 
its ability to bind to the protein and inhibit its activity. In this manner, synthesis of
ineffective compounds may be avoided. 

A binding compound of Argonaute may be computationally evaluated and 
designed by means of a series of steps in which chemical groups or fragments are 
screened and selected for their ability to associate with the individual binding
pockets or interface surfaces of each of the proteins. One skilled in the art may use
one of several methods to screen chemical groups or fragments for their ability to 
associate with Argonaute. Docking may be accomplished using software such as 
QUANTA and SYBYL, followed by energy minimization and molecular dynamics 
with standard molecular mechanics force fields, such as CHARMM and AMBER. 

Specialized computer programs may also assist in the process of selecting
fragments or chemical groups. These include:

1. GRID (Goodford, 1985, J. Med. Chem. 28:849-857). GRID is available
from Oxford University, Oxford, UK; 

2. MCSS (Miranker & Karplus, 1991, Proteins: Structure, Function and 
Genetics 11:29-34). MCSS is available from Molecular Simulations, Burlington,
Mass.;

3. AUTODOCK (Goodsell & Olson, 1990, Proteins: Structure, Function, and
Genetics 8:195-202). AUTODOCK is available from Scripps Research Institute, La 
Jolla, Calif.; 

4. DOCK (Kuntz et al., 1982, J. Mol. Biol. 161:269-288). DOCK is available 
from University of California, San Francisco, Calif.;

5. FlexE (Claussen et al., 2001, J. Mol. Biol. 308:377-395). FlexE is available
from Tripos, St. Louis, Mo.;

6. Glide. Glide is available from Schrodinger, Portland, Oreg.;

7. GOLD (Jones et al., 1995, J. Mol. Biol. 245:43-53);

8. QXP (McMartin & Bohacek, 1997, J. Comput. Aided Mol. Des. 11:333-344);

9. ICM (http://www.molsoft.com). ICM is available from Molsoft, San Diego,
Calif.; and

10. FlexX (SYBYL, Tripos, St. Louis, Mo.).

Once suitable chemical groups or fragments have been selected, they can be
assembled into a single compound. Assembly may proceed by visual inspection of
the relationship of the fragments to each other in the three-dimensional image
displayed on a computer screen in relation to the structure coordinates of Argonaute.
This would be followed by manual model building using software such as
QUANTA or SYBYL.

Useful programs to aid one of skill in the art in connecting the individual
chemical groups or fragments include:

1. CAVEAT (Bartlett et al., 1989, "CAVEAT: A Program to Facilitate the
Structure-Derived Design of Biologically Active Molecules," in Molecular
Recognition in Chemical and Biological Problems, Special Pub., Royal Chem. Soc.
78:182-196). CAVEAT is available from the University of California, Berkeley,
Calif.;

2. 3D Database systems such as MACCS-3D (MDL Information Systems, 
San Leandro, Calif.). This area is reviewed in Martin, 1992, J. Med. Chem.
35:2145-2154; and

3. HOOK (available from Molecular Simulations, Burlington, Mass.). 

Instead of proceeding to build a modulator of Argonaute in a step-wise 
fashion one fragment or chemical group at a time, as described above, Argonaute- 
binding compounds or modulators may be designed as a whole or 'de novo' using 
either an empty binding site or the surface of a protein that participates in
protein/protein interactions in a co-complex, or optionally including some portion(s)
of a known modulator(s). These methods include:

1. LUDI (Bohm, 1992, J. Comp. Aid. Molec. Design 6:61-78). LUDI is 
available from Molecular Simulations, Inc., San Diego, Calif.; 

2. LEGEND (Nishibata & Itai, 1991, Tetrahedron 47:8985). LEGEND is
available from Molecular Simulations, Burlington, Mass.; and

3. LeapFrog (available from Tripos, Inc., St. Louis, Mo.). 

Other molecular modeling techniques may also be employed in accordance 
with this invention. See, e.g., Cohen et al., 1990, J. Med. Chem. 33:883-894. See 
also, Navia & Murcko, 1992, Current Opinion in Structural Biology 2:202-210.

Once a compound has been designed or selected by the above methods, the 
efficiency with which that compound may bind to Argonaute may be tested and 
optimized by computational evaluation. An effective modulator of Argonaute must 
preferably demonstrate a relatively small difference in energy between its bound and 
free states (i.e., it must have a small deformation energy of binding). Thus, the most 
efficient modulators should preferably be designed with a deformation energy of 
binding of not greater than about 10 kcal/mol, preferably, not greater than 7
kcal/mol. Modulators may interact with the protein in more than one conformation
that is similar in overall binding energy. In those cases, the deformation energy of 
binding is taken to be the difference between the energy of the free compound and 
the average energy of the conformations observed when the modulator binds to the 
protein.
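
The deformation energy criterion described above can be sketched as follows. The function names and the sign convention (bound minus free, so that conformational strain is positive) are assumptions made for illustration; the energies themselves would come from a separate calculation with one of the programs discussed herein.

```python
from statistics import mean


def deformation_energy(free_energy, bound_energies):
    """Average energy of the bound conformation(s) minus the free-state energy.

    All energies are in kcal/mol. When the modulator binds in several
    conformations, their energies are averaged, as described in the text.
    """
    if not bound_energies:
        raise ValueError("at least one bound-state energy is required")
    return mean(bound_energies) - free_energy


def meets_deformation_limit(free_energy, bound_energies, limit=10.0):
    """True if the deformation energy does not exceed the limit (kcal/mol)."""
    return deformation_energy(free_energy, bound_energies) <= limit
```

For example, a compound with a free-state energy of -50 kcal/mol and two bound conformations at -44 and -42 kcal/mol has a deformation energy of 7 kcal/mol, which satisfies both the 10 kcal/mol and the 7 kcal/mol limits.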

A compound selected or designed for binding to or inhibiting Argonaute may 
be further computationally optimized so that in its bound state it would preferably 
lack repulsive electrostatic interaction with the target protein. Such non-complementary
electrostatic interactions include repulsive charge-charge, dipole-dipole
and charge-dipole interactions. Specifically, the sum of all electrostatic
interactions between the modulator and the protein when the modulator is bound to
it preferably makes a neutral or favorable contribution to the enthalpy of binding.
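
One simple way to estimate the summed electrostatic contribution is a pairwise Coulomb sum over partial charges, sketched below. The point-charge model, the uniform dielectric constant, the approximate conversion factor, and the function names are all assumptions for illustration and do not reproduce the treatment used by any particular software package.

```python
import math

# Approximate conversion factor giving kcal/mol for charges in units of the
# elementary charge and distances in angstroms (assumed value, about 332).
COULOMB_CONSTANT = 332.06


def electrostatic_energy(modulator_atoms, protein_atoms, dielectric=4.0):
    """Sum q_i * q_j / (dielectric * r_ij) over all modulator/protein pairs.

    Each atom is an (x, y, z, partial_charge) tuple.
    """
    total = 0.0
    for mx, my, mz, mq in modulator_atoms:
        for px, py, pz, pq in protein_atoms:
            r = math.dist((mx, my, mz), (px, py, pz))
            total += COULOMB_CONSTANT * mq * pq / (dielectric * r)
    return total


def is_neutral_or_favorable(modulator_atoms, protein_atoms, dielectric=4.0):
    """True if the summed electrostatic energy is zero or negative."""
    return electrostatic_energy(modulator_atoms, protein_atoms, dielectric) <= 0.0
```

A negative sum corresponds to a net attractive (favorable) interaction, and a positive sum flags a pose with net repulsive charge contacts that may warrant further optimization.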

Specific computer software is available in the art to evaluate compound 
deformation energy and electrostatic interaction. Examples of programs designed
for such uses include: Gaussian 92, revision C (Frisch, Gaussian, Inc., Pittsburgh,
Pa. ©1992); AMBER, version 4.0 (Kollman, University of California at San 
Francisco, ©1994); QUANTA/CHARMM (Molecular Simulations, Inc., Burlington, 
Mass., ©1994); and Insight II/Discover (Biosym Technologies Inc., San Diego, 
Calif., ©1994). These programs may be implemented, for instance, using a
computer workstation, as is well known in the art. Other hardware systems and
software packages will be known to those skilled in the art. 

The computer-assisted methods for designing a modulator of Argonaute 
activity can be de novo or based on a candidate compound. An example of a 
computer-assisted method for designing a modulator of Argonaute activity de novo
would thus involve the steps of: (1) supplying a computer modeling application with
a set of structure coordinates of a molecule or molecular complex comprising at least
a portion of an Argonaute; (2) computationally building a chemical entity
represented by a set of structure coordinates; and (3) determining whether the
chemical entity is a modulator expected to bind to or interfere with the molecule or
molecular complex, wherein binding to or interfering with the molecule or
molecular complex is indicative of potential modulation of Argonaute activity.

Once a modulator or Argonaute-binding compound has been optimally
selected or designed, as described above, substitutions may then be made in some of 
its atoms or chemical groups in order to improve or modify its binding properties. 
Generally, initial substitutions are conservative, i.e., the replacement group will have 
approximately the same size, shape, hydrophobicity and charge as the original
group. One of skill in the art will understand that substitutions known in the art to
alter conformation should be avoided. Such altered chemical compounds may then 
be analyzed for efficiency of binding to Argonaute by the same computer methods 
described in detail above. 

An example of such a computer-assisted method for identifying a
modulator of Argonaute activity would thus involve (1) supplying a computer
modeling application with a set of structure coordinates of a molecule or molecular
complex comprising at least a portion of an Argonaute or Argonaute-like compound;
(2) supplying the computer modeling application with a set of structure coordinates
of a chemical entity; and (3) determining whether the chemical entity is a
modulator expected to bind to or modulate the molecule or molecular complex.

The structure coordinates of an Argonaute co-complex, or of Argonaute 
alone, or of portions thereof, are particularly useful to solve the structure of other
co-complexes of Argonaute, of mutants, of the Argonaute co-complex further
complexed to another molecule, or of the crystalline form of any other protein or
protein co-complex with significant amino acid sequence homology to any 
functional domain of Argonaute. 

One method that may be employed for this purpose is molecular 
replacement. In this method, the unknown co-crystal structure, whether it is another 
Argonaute co-complex, a mutant, an Argonaute co-complex that is further complexed
to another molecule, or the crystal of some other protein or protein co-complex with
significant amino acid sequence homology to any functional domain of one of the 
proteins in the co-complex crystal, may be determined using phase information from 
the present Argonaute co-complex structure coordinates. This method will provide 
an accurate three-dimensional structure for the unknown protein or protein
co-complex in the new crystal more quickly and efficiently than attempting to
determine such information ab initio. 

If an unknown crystal form has the same space group as and similar cell
dimensions to the known co-complex crystal form, then the phases derived from the
known crystal form can be directly applied to the unknown crystal form, and in turn,
an electron density map for the unknown crystal form can be calculated. Difference
electron density maps can then be used to examine the differences between the
unknown crystal form and the known crystal form. A difference electron density
map is a subtraction of one electron density map, e.g., that derived from the known
crystal form, from another electron density map, e.g., that derived from the unknown
crystal form. Therefore, all similar features of the two electron density maps are
eliminated in the subtraction and only the differences between the two structures
remain. However, if the space groups and/or cell dimensions of the two crystal
forms are different, then this approach will not work and molecular replacement
must be used in order to derive phases for the unknown crystal form.
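
The subtraction that defines a difference electron density map can be illustrated with a toy real-space example. The sketch assumes both maps have already been sampled on the same grid and placed on a common scale, and represents each map as a flat list of density values; the function names and the peak threshold are hypothetical. Crystallographic software generally computes difference maps from structure-factor amplitudes and phases rather than from gridded maps, so this is a conceptual illustration only.

```python
def difference_map(unknown_map, known_map):
    """Subtract the known-form density from the unknown-form density, point by point."""
    if len(unknown_map) != len(known_map):
        raise ValueError("maps must be sampled on the same grid")
    return [u - k for u, k in zip(unknown_map, known_map)]


def significant_points(diff_map, threshold):
    """Return grid indices where the absolute difference density meets the threshold."""
    return [i for i, value in enumerate(diff_map) if abs(value) >= threshold]
```

Grid points where the two maps agree cancel to zero, so only features unique to one crystal form, such as a bound ligand, remain above the threshold.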

The techniques of X-ray diffraction can be employed in the study of the
co-complexes of Argonaute. This information may thus be used to optimize known
modulators of Argonaute and more importantly, to design and synthesize novel 
classes of modulators of Argonaute. 

Subsets of the atomic structure coordinates can also be used in any of the
above methods. Particularly useful subsets of the coordinates include, but are not
limited to, coordinates of single domains, coordinates of residues lining an active
site, coordinates of residues that participate in important protein-protein contacts at
an interface, and Cα coordinates. For example, the coordinates of one domain of a
protein that contains the active site may be used to design modulators that bind to
that site, even though the protein is fully described by a larger set of atomic
coordinates. Therefore, as described in detail for the specific embodiments, below, a
set of atomic coordinates that defines the entire polypeptide chain, although useful for
many applications, does not necessarily need to be used for the methods described
herein.
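
By way of illustration, coordinate subsets of the kind listed above can be extracted from a coordinate file as sketched below. The sketch assumes the coordinates are stored as ATOM records in the fixed-column PDB format; the function names and the dictionary layout are hypothetical.

```python
def parse_atoms(pdb_text):
    """Read ATOM records from PDB-format text using the fixed column layout."""
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith("ATOM"):
            atoms.append({
                "name": line[12:16].strip(),      # atom name, columns 13-16
                "chain": line[21],                # chain identifier, column 22
                "resseq": int(line[22:26]),       # residue number, columns 23-26
                "xyz": (float(line[30:38]),       # x, columns 31-38
                        float(line[38:46]),       # y, columns 39-46
                        float(line[46:54])),      # z, columns 47-54
            })
    return atoms


def c_alpha_subset(atoms):
    """Keep only the alpha-carbon atom of each residue."""
    return [atom for atom in atoms if atom["name"] == "CA"]


def residue_subset(atoms, chain, first, last):
    """Keep atoms of one chain whose residue number lies in [first, last]."""
    return [atom for atom in atoms
            if atom["chain"] == chain and first <= atom["resseq"] <= last]
```

A single domain or the residues lining an active site can be selected with residue_subset once the relevant residue numbers are known, and the result passed to any of the modeling methods described above in place of the full coordinate set.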

II. Assay for Argonaute RNase Activity

The present application provides screening methods for agents that modulate 
the RNase activity of the Argonaute protein. Applicants have shown that Argonaute
has an RNase H domain and acts as the Slicer enzyme of RISC to cleave mRNA
bound by a single-stranded siRNA. Thus, the Argonaute activity can be assayed by
any standard technique in the art for measuring RNase activity. The
exemplification provides one such example.

In certain embodiments, the RNase H activity of Argonaute can be measured. 
For example, WO 04/59012 describes a "Molecular Beacon" Assay for measuring 
RNase H activity and/or other nuclease-mediated cleavage of nucleic acids. Briefly,
the assay detects degradation of a nucleic acid substrate which, preferably, is an
RNA substrate that is annealed to at least one region or part of an oligonucleotide
probe. In preferred embodiments, the oligonucleotide probe is a DNA probe (e.g., a
deoxyoligonucleotide probe), which may also be referred to in the context of this
invention as the DNA "substrate" moiety. Typically, both the oligonucleotide probe
and the RNA substrate will be oligonucleotide molecules that are between about 10
and about 100 nucleotides in length and may be, e.g., between about 10 and 50
nucleotides in length, more preferably between 15 and 25 nucleotides in length. In
preferred embodiments, the oligonucleotide probe is at least 18 nucleotides in 
length. 

Chan et al. describe a capillary electrophoretic assay to measure RNase H
activity. See Anal Biochem. 2004 Aug 15;331(2):296-302. Briefly, cleavage of a
fluorescein-labeled RNA-DNA heteroduplex was monitored by capillary 
electrophoresis. This assay was used as a secondary assay to confirm hits from a 
high-throughput screening program. Since autofluorescent compounds in samples
migrated differently from both substrate and product in most cases, the assay was
extremely robust for assaying enzymatic inhibition of such samples, in contrast to a
simple well-based approach. 

The screening methods may be conducted in a high-throughput fashion using
any techniques available in the art. Recently, Parniak et al. described a
fluorescence-based high-throughput screening assay for inhibitors of HIV RNase H
activity. See Anal Biochem 2003, 322:33-9. Briefly, the assay substrate is an
18-nucleotide 3'-fluorescein-labeled RNA annealed to a complementary 18-nucleotide
5'-Dabcyl-modified DNA. The intact duplex has an extremely low background
fluorescent signal and provides up to 50-fold fluorescent signal enhancement
following hydrolysis. The size and sequence of the duplex are such that HIV-1
RT-RNase H cuts the RNA strand close to the 3' end. The fluorescein-labeled
ribonucleotide fragment readily dissociates from the complementary DNA at room
temperature with immediate generation of a fluorescent signal. This assay is rapid,
inexpensive, and robust, providing Z' factors of 0.8 and coefficients of variation of
about 5%. The assay can be carried out both in real-time (continuous) and in
"quench" modes; the latter requires only two addition steps with no washing and is
thus suitable for robotic operation. Several chemical libraries totaling more than
106,000 compounds were screened with this assay in approximately 1 month.
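
The two plate-quality statistics mentioned above can be computed from control wells as sketched below, using the commonly cited definition Z' = 1 - 3(SDpos + SDneg) / |meanpos - meanneg|. The function names and the use of the sample standard deviation are assumptions for illustration; the control values in the example are invented.

```python
from statistics import mean, stdev


def z_prime(positive_controls, negative_controls):
    """Z' factor: 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|."""
    separation = abs(mean(positive_controls) - mean(negative_controls))
    if separation == 0:
        raise ValueError("control means must differ")
    spread = stdev(positive_controls) + stdev(negative_controls)
    return 1.0 - 3.0 * spread / separation


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return 100.0 * stdev(values) / mean(values)
```

For instance, uninhibited wells reading 98, 100 and 102 and fully inhibited wells reading 9, 10 and 11 give a Z' of 0.9 and a coefficient of variation of 2% for the uninhibited signal.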

Alternatively, McLellan et al. described a nonradioactive, 96-well plate assay
designed to be used for high-throughput screening of compounds capable of
inhibiting the RNase H activity of HIV-1 reverse transcriptase. See McLellan et al.,
Biotechniques. 2002 Aug;33(2):424-9. In this method, tRNA labeled with
digoxigenin-modified reporter residues was employed as substrate. The labeled
tRNA was prehybridized with a DNA oligonucleotide that contained a single
biotinylated residue at its 5'-terminus to ensure its attachment to streptavidin-coated
microplates. The uncleaved, immobilized DNA/tRNA substrate was detected
through the use of established ELISA protocols. Incubation with purified HIV-1
reverse transcriptase initiated RNase H degradation and caused a signal reduction to
negligible background levels. In contrast, the signal intensity remained unaffected
when using an RNase H deficient mutant enzyme. The assay was validated using
the hydrazone derivative BBNH that was previously shown to inhibit RNase H
degradation at concentrations below 10 µM.

III. Reporter Gene Assay 

The application also provides reporter gene assays. The reporter gene assays
may be used to identify agents that modulate (e.g., increase) expression of
Argonaute gene(s), e.g., by modulating Argonaute's promoter activity. For
example, by operably linking an Argonaute's promoter with a reporter gene, the
activity of the promoter can be monitored through monitoring/measuring the
expression level of the reporter gene. Many reporter gene assays have been developed
and are known to skilled artisans. Examples include: β-galactosidase assays;
β-glucuronidase assays; β-lactamase assays (kits, β-lactamase FRET substrates or
color substrates are commercially available); CAT assays; Dual Reporter assays;
GFP Assays; Luciferase Assays; SEAP Assays.

IV. Binding Assay 

As described above, in silico screening or assays may be developed to
identify a ligand or an inhibitor of interest, such as a ligand or an inhibitor that
interacts with an Argonaute protein, e.g., a hAgo-2 protein. A ligand generally 
refers to a molecule (e.g., a nucleic acid molecule or a non-nucleic acid small 
molecule) that binds a molecule of interest (e.g., an Argonaute protein of the
application). An inhibitor generally refers to a molecule that inhibits the function or
activity of its target molecule, e.g., an Argonaute protein of the application. 

A variety of assay formats will suffice and, in light of the present disclosure, 
those not expressly described herein will nevertheless be comprehended by one of 
ordinary skill in the art. Assay formats which approximate such conditions as 
formation of protein-based complexes and enzymatic activity may be generated in
many different forms, and include assays based on cell-free systems, e.g., purified 
proteins or cell lysates, as well as cell-based assays which utilize intact cells. 
Simple binding assays can also be used to detect agents which bind to a protein of 
the application. Agents to be tested can be produced, for example, by bacteria, yeast
or other organisms (e.g., natural products), produced chemically (e.g., small
molecules, including peptidomimetics), or produced recombinantly. In a preferred 
embodiment, the test agent is a small organic molecule, e.g., other than a peptide or 
oligonucleotide, having a molecular weight of less than about 6,000 daltons. 

In many drug screening programs which test libraries of compounds and
natural extracts, high throughput assays are desirable in order to maximize the
number of compounds surveyed in a given period of time. Assays of the present 
application which are performed in cell-free systems, such as may be developed with 
purified or semi-purified proteins or with lysates, are often preferred as "primary"
screens in that they can be generated to permit rapid development and relatively easy
detection of an alteration in a molecular target which is mediated by a test 
compound. Moreover, the effects of cellular toxicity and/or bioavailability of the 
test compound can be generally ignored in the in vitro system, the assay instead 
being focused primarily on the effect of the drug on the molecular target as may be
manifest in the affinity of the drug to the molecular target and/or changes in
enzymatic properties of the molecular target. 

In certain embodiments, an Argonaute protein to be used in a binding assay
is at least semi-purified. By semi-purified, it is meant that the proteins
utilized in the reconstituted mixture have been previously separated from other
cellular or viral proteins. For instance, in contrast to cell lysates, the proteins
involved in the protein-based complex formation are present in the mixture to at
least 50% purity relative to all other proteins in the mixture, and more preferably are
present at 90-95% purity.

Assaying the protein-based complexes of the application, in the presence or 
absence of a candidate agent, can be accomplished in any vessel suitable for
containing the reactants. Examples include microtitre plates, test tubes, and
micro-centrifuge tubes.

In an exemplary binding assay, the agent or compound of interest is 
contacted with an Argonaute protein. Detection and quantification of the Argonaute 
protein-based complex (e.g., a co-complex formed by the Argonaute protein and the
compound) provides a means for determining the compound's affinity for the
Argonaute protein. 

Protein-based complex formation may be detected by a variety of techniques, 
many of which are effectively described herein. For instance, formation of 
complexes can be quantitated using, for example, detectably labeled proteins (e.g.,
radiolabeled, fluorescently labeled, or enzymatically labeled), by immunoassay, or 
by chromatographic detection. Surface plasmon resonance systems, such as those 
available from Biacore International AB (Uppsala, Sweden), may also be used to 
detect binding interactions. 

Often, it will be desirable to immobilize the protein to facilitate separation of
complexes from uncomplexed forms of agents to be assayed for their binding
affinity to a protein, as well as to accommodate automation of the assay. In an
illustrative embodiment, a fusion protein can be provided which adds a domain that
permits the protein (or a portion of the protein) to be bound to an insoluble matrix.
For example, GST-Argonaute (or a portion thereof) fusion proteins can be adsorbed
onto glutathione sepharose beads (Sigma Chemical, St. Louis, MO) or glutathione
derivatized microtitre plates, which are then combined with test agents, e.g., radio-
or fluorescent-labeled agents, and incubated under conditions conducive to complex
formation. Following incubation, the beads are washed to remove any unbound test
agents, and the matrix- or bead-bound label(s) are determined directly, or in the
supernatant after the complexes are dissociated, e.g., when a microtitre plate is used.

RNAi 

The term "RNAi construct," as used herein, refers to a construct comprising
nucleotides that hybridize under physiological conditions to a portion of a target gene
and attenuate expression of the target gene. In certain embodiments, the RNAi
construct, when introduced into a cell, induces a sequence-specific RNA interference
process. The RNAi construct used in the present application may be single-stranded
siRNAs (ssRNAs) or double-stranded siRNAs (dsRNAs), which include short "hairpin"
RNAs (shRNAs). The RNAi construct may comprise one or more
strands of polymerized ribonucleotide. It may include modifications to either the
phosphate-sugar backbone or the nucleoside. For example, the phosphodiester
linkages of natural RNA may be modified to include at least one of a nitrogen or
sulfur heteroatom. Modifications in RNA structure may be tailored to allow specific
genetic inhibition while avoiding a general panic response in some organisms which
is generated by RNAi. Likewise, bases may be modified to block the activity of
adenosine deaminase. The RNAi construct may be produced enzymatically or by
partial/total organic synthesis; any modified ribonucleotide can be introduced by in
vitro enzymatic or organic synthesis.

The RNAi construct may be directly introduced into the cell (i.e., 
intracellularly); or introduced extracellularly into a cavity, interstitial space, into the 
circulation of an organism, introduced orally, or may be introduced by bathing an 
organism in a solution containing RNA. Methods for oral introduction include 
direct mixing of RNA with food of the organism, as well as engineered approaches
in which a species that is used as food is engineered to express an RNA, then fed to 
the organism to be affected. Physical methods of introducing nucleic acids include 
injection of an RNA solution directly into the cell or extracellular injection into the 
organism. 

The double-stranded structure may be formed by a single
self-complementary RNA strand (shRNA) or two complementary RNA strands. RNA
duplex formation may be initiated either inside or outside the cell. The RNA may be 
introduced in an amount which allows delivery of at least one copy per cell. Higher 
doses (e.g., at least 5, 10, 100, 500 or 1000 copies per cell) of double-stranded
material may yield more effective inhibition; lower doses may also be useful for
specific applications. Inhibition is sequence-specific in that nucleotide sequences 
corresponding to the duplex region of the RNA are targeted for genetic inhibition. 

RNAi constructs containing a nucleotide sequence identical to a portion, of
either coding or non-coding sequence, of the target gene are preferred for inhibition.
RNA sequences with insertions, deletions, and single point mutations relative to the
target sequence (dsRNA similar to the target gene) have also been found to be
effective for inhibition. Thus, sequence identity may be optimized by sequence
comparison and alignment algorithms known in the art (see Gribskov and Devereux,
Sequence Analysis Primer, Stockton Press, 1991, and references cited therein) and
calculating the percent difference between the nucleotide sequences by, for example,
the Smith-Waterman algorithm as implemented in the BESTFIT software program
using default parameters (e.g., University of Wisconsin Genetic Computing Group).
Greater than 90% sequence identity, or even 100% sequence identity, between the
inhibitory RNA and the portion of the target gene is preferred. Alternatively, the
duplex region of the RNA may be defined functionally as a nucleotide sequence that
is capable of hybridizing with a portion of the target gene transcript (e.g., 400 mM
NaCl, 40 mM PIPES pH 6.4, 1 mM EDTA, 50 °C or 70 °C hybridization for 12-16
hours; followed by washing). In certain preferred embodiments, the RNAi
construct is at least 20, 21 or 22 nucleotides in length, e.g., corresponding in size to
RNA products produced by Dicer-dependent cleavage. In certain embodiments, the
RNAi construct is at least 25, 50, 100, 200, 300 or 400 bases. In certain
embodiments, the RNAi construct is 400-800 bases in length.
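
A minimal Smith-Waterman local alignment with a percent-identity calculation is sketched below to illustrate the comparison described above. It is not the BESTFIT implementation: the linear gap penalty and the match, mismatch and gap scores are assumptions chosen for illustration and are not the BESTFIT defaults.

```python
def smith_waterman(seq_a, seq_b, match=2, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for the best local alignment."""
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            score[i][j] = max(0,
                              score[i - 1][j - 1] + pair,  # align the two bases
                              score[i - 1][j] + gap,       # gap in seq_b
                              score[i][j - 1] + gap)       # gap in seq_a
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)
    # Trace back from the highest-scoring cell until a zero score is reached.
    aligned_a, aligned_b = [], []
    i, j = best_pos
    while i > 0 and j > 0 and score[i][j] > 0:
        pair = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + pair:
            aligned_a.append(seq_a[i - 1])
            aligned_b.append(seq_b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            aligned_a.append(seq_a[i - 1])
            aligned_b.append("-")
            i -= 1
        else:
            aligned_a.append("-")
            aligned_b.append(seq_b[j - 1])
            j -= 1
    return best, "".join(reversed(aligned_a)), "".join(reversed(aligned_b))


def percent_identity(aligned_a, aligned_b):
    """Identical, non-gap positions as a percentage of the alignment length."""
    if not aligned_a:
        return 0.0
    same = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * same / len(aligned_a)
```

The percent identity of the aligned region can then be compared against the preferred threshold of greater than 90%.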

In certain embodiments, an shRNA construct is designed with about 29 bp 
helices. Further information on the optimization of shRNA constructs may be 
found, for example, in the following references: Paddison, et al. Proc Natl Acad Sci
USA, 2002. 99(3): p. 1443-8; Brummelkamp, et al. Science, 2002. 21: p. 21;
Kawasaki, et al. Nucleic Acids Res, 2003. 31(2): p. 700-7; Lee, et al. Nat Biotechnol,
2002. 20(5): p. 500-5; Miyagishi, et al. Nat Biotechnol, 2002. 20(5): p. 497-500;
Paul, et al. Nat Biotechnol, 2002. 20(5): p. 505-8.
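
As a simple illustration of a hairpin with a 29-base stem, the sketch below assembles a sense strand, a loop and the reverse complement of the sense strand. The strand order (sense, loop, antisense), the loop sequence and the function names are assumptions for illustration only; the cited references should be consulted for design rules that have actually been validated.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence written 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]


def design_shrna(target_site, stem_length=29, loop="UUCAAGAGA"):
    """Build sense stem + loop + antisense stem from a target-site sequence.

    target_site is the sense (mRNA) sequence; DNA-style T is converted to U.
    The loop sequence is an illustrative choice, not a requirement.
    """
    site = target_site.upper().replace("T", "U")
    if set(site) - set("ACGU"):
        raise ValueError("target site contains non-nucleotide characters")
    if len(site) < stem_length:
        raise ValueError("target site is shorter than the requested stem")
    sense = site[:stem_length]
    return sense + loop + reverse_complement(sense)
```

When the transcript folds back on itself, the antisense portion pairs with the sense portion to form a stem of the requested length, here about 29 bp, joined by the unpaired loop.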

The RNAi construct may be synthesized either in vivo or in vitro.
Endogenous RNA polymerase of the cell may mediate transcription in vivo, or
cloned RNA polymerase can be used for transcription in vivo or in vitro. For 
transcription from a transgene in vivo or an expression construct, a regulatory region 
(e.g., promoter, enhancer, silencer, splice donor and acceptor, polyadenylation) may 
be used to transcribe the RNAi strand (or strands). Inhibition may be targeted by
specific transcription in an organ, tissue, or cell type; stimulation of an
environmental condition (e.g., infection, stress, temperature, chemical inducers);
and/or engineering transcription at a developmental stage or age. The RNA strands
may or may not be polyadenylated; the RNA strands may or may not be capable of 
being translated into a polypeptide by a cell's translational apparatus. The RNAi 
construct may be chemically or enzymatically synthesized by manual or automated 
reactions. The RNAi construct may be synthesized by a cellular RNA polymerase
or a bacteriophage RNA polymerase (e.g., T3, T7, SP6). The use and production of 
an expression construct are known in the art (see also WO 97/32016; U.S. Pat. Nos. 
5,593,874, 5,698,425, 5,712,135, 5,789,214, and 5,804,693; and the references cited 
therein). If synthesized chemically or by in vitro enzymatic synthesis, the RNA may
be purified prior to introduction into the cell. For example, RNA can be purified
from a mixture by extraction with a solvent or resin, precipitation, electrophoresis, 
chromatography or a combination thereof. Alternatively, the RNAi construct may 
be used with no or a minimum of purification to avoid losses due to sample 
processing. The RNAi construct may be dried for storage or dissolved in an aqueous
solution. The solution may contain buffers or salts to promote annealing, and/or
stabilization of the duplex strands. 

Physical methods of introducing nucleic acids include injection of a solution 
containing the RNAi construct, bombardment by particles covered by the RNAi 
construct, soaking the cell or organism in a solution of the RNA, or electroporation
of cell membranes in the presence of the RNAi construct. A viral construct
packaged into a viral particle would accomplish both efficient introduction of an
expression construct into the cell and transcription of RNAi construct encoded by 
the expression construct. Other methods known in the art for introducing nucleic 
acids to cells may be used, such as lipid-mediated carrier transport, chemical 

25 mediated transport, such as calcium phosphate, and the like. Thus the RNAi 

construct may be introduced along with components that perform one or more of the 
following activities: enhance RNA uptake by the cell, promote annealing of the 
duplex strands, stabilize the annealed strands, or other-wise increase inhibition of the 
target gene. 

"Inhibition of gene expression" refers to the absence or observable decrease
in the level of protein and/or mRNA product from a target gene. "Specificity" refers
to the ability to inhibit the target gene without manifest effects on other genes of the
cell. The consequences of inhibition can be confirmed by examination of the
outward properties of the cell or organism (as presented below in the examples) or
by biochemical techniques such as RNA solution hybridization, nuclease protection,
Northern hybridization, reverse transcription, gene expression monitoring with a
microarray, antibody binding, enzyme linked immunosorbent assay (ELISA),
Western blotting, radioimmunoassay (RIA), other immunoassays, and fluorescence
activated cell analysis (FACS). For RNA-mediated inhibition in a cell line or whole
organism, gene expression is conveniently assayed by use of a reporter or drug
resistance gene whose protein product is easily assayed. Such reporter genes include
acetohydroxyacid synthase (AHAS), alkaline phosphatase (AP), beta galactosidase
(LacZ), beta glucuronidase (GUS), chloramphenicol acetyltransferase (CAT), green
fluorescent protein (GFP), horseradish peroxidase (HRP), luciferase (Luc), nopaline
synthase (NOS), octopine synthase (OCS), and derivatives thereof. Multiple
selectable markers are available that confer resistance to ampicillin, bleomycin,
chloramphenicol, gentamycin, hygromycin, kanamycin, lincomycin, methotrexate,
phosphinothricin, puromycin, and tetracycline.

Depending on the assay, quantitation of the amount of gene expression
allows one to determine a degree of inhibition which is greater than 10%, 33%, 50%,
90%, 95% or 99% as compared to a cell not treated according to the present
application. As an example, the efficiency of inhibition may be determined by
assessing the amount of gene product in the cell: mRNA may be detected with a
hybridization probe having a nucleotide sequence outside the region used for the
inhibitory double-stranded RNA, or translated polypeptide may be detected with an
antibody raised against the polypeptide sequence of that region.
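
By way of illustration only, the quantitation described above may be sketched as follows. The function names and the example signal values are assumptions introduced for this sketch and are not part of the application; the threshold values are those listed in the preceding paragraph.

```python
# Thresholds named in the text above (percent inhibition).
THRESHOLDS = (10, 33, 50, 90, 95, 99)


def percent_inhibition(treated: float, untreated: float) -> float:
    """Percent decrease in gene product relative to an untreated control cell."""
    if untreated <= 0:
        raise ValueError("untreated control signal must be positive")
    return 100.0 * (1.0 - treated / untreated)


def thresholds_exceeded(inhibition: float) -> list:
    """Return the listed thresholds that the measured inhibition is greater than."""
    return [t for t in THRESHOLDS if inhibition > t]


if __name__ == "__main__":
    # Hypothetical reporter signals in arbitrary units, for illustration only.
    value = percent_inhibition(treated=12.0, untreated=100.0)
    print(f"{value:.1f}% inhibition; exceeds thresholds {thresholds_exceeded(value)}")
```

The same calculation applies whether the signal is an mRNA hybridization intensity or an antibody-based protein measurement, provided treated and untreated cells are assayed in the same way.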

As disclosed herein, the present application is not limited to any type of
target gene or nucleotide sequence. In some preferred embodiments, the target gene
is an essential gene or a gene which is essential for cell viability. The following
classes of possible target genes are listed for illustrative purposes: developmental
genes (e.g., adhesion molecules, cyclin kinase inhibitors, Wnt family members, Pax
family members, Winged helix family members, Hox family members, cytokines,
lymphokines and their receptors, growth/differentiation factors and their receptors,
neurotransmitters and their receptors); oncogenes (e.g., ABL1, BCL1, BCL2, BCL6,
CBFA2, CBL, CSF1R, ERBA, ERBB, ERBB2, ETS1, ETV6, FGR, FOS,
FYN, HCR, HRAS, JUN, KRAS, LCK, LYN, MDM2, MLL, MYB, MYC, MYCL1,
MYCN, NRAS, PIM1, PML, RET, SRC, TAL1, TCL3, and YES); tumor
suppressor genes (e.g., APC, BRCA1, BRCA2, MADH4, MCC, NF1, NF2, RB1,
P53, BIM, PUMA and WT1); and enzymes (e.g., ACC synthases and oxidases, ACP
desaturases and hydroxylases, ADP-glucose pyrophosphorylases, ATPases, alcohol
dehydrogenases, amylases, amyloglucosidases, catalases, cellulases, chalcone
synthases, chitinases, cyclooxygenases, decarboxylases, dextrinases, DNA and RNA
polymerases, galactosidases, glucanases, glucose oxidases, granule-bound starch
synthases, GTPases, helicases, hemicellulases, integrases, inulinases, invertases,
isomerases, kinases, lactases, lipases, lipoxygenases, lysozymes, nopaline synthases,
octopine synthases, pectinesterases, peroxidases, phosphatases, phospholipases,
phosphorylases, phytases, plant growth regulator synthases, polygalacturonases,
proteinases and peptidases, pullulanases, recombinases, reverse transcriptases,
RUBISCOs, topoisomerases, and xylanases).

The application also provides variations of the methods described herein,
wherein inhibition of the expression of more than one gene is achieved. This may be
achieved, for example, by expressing multiple shRNAs, or by designing an shRNA to inhibit
the gene expression of two or more genes which share substantial nucleotide
sequence identity in a short stretch, preferably at least 90% identity over a length of
20, 22, 25, 27, or 30 nucleotides.
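
As a non-limiting sketch of how a shared stretch of the kind described above might be located, the following compares every ungapped window of a chosen length between two sequences and reports those with at least 90% identity. The function name, the exhaustive window comparison and the toy sequences are assumptions made for illustration; the application does not prescribe a particular search procedure.

```python
def shared_windows(seq_a: str, seq_b: str, length: int = 22, min_identity: float = 0.90):
    """Yield (pos_a, pos_b, identity) for ungapped windows of `length` nucleotides
    whose fraction of identical positions is at least `min_identity`."""
    seq_a, seq_b = seq_a.upper(), seq_b.upper()
    for i in range(len(seq_a) - length + 1):
        win_a = seq_a[i:i + length]
        for j in range(len(seq_b) - length + 1):
            win_b = seq_b[j:j + length]
            matches = sum(x == y for x, y in zip(win_a, win_b))
            identity = matches / length
            if identity >= min_identity:
                yield i, j, identity


if __name__ == "__main__":
    # Toy sequences (hypothetical): a 20-nt core shared with a single mismatch.
    gene_a = "TTTT" + "ACGTTGCAAGGCTTACGGAT" + "CCCC"
    gene_b = "GGA" + "ACGTTGCAAGGCTAACGGAT" + "AAT"
    for pos_a, pos_b, identity in shared_windows(gene_a, gene_b, length=20):
        print(pos_a, pos_b, f"{identity:.2f}")
```

The exhaustive comparison is quadratic in sequence length, which is adequate for a pair of transcripts; an indexed search would be a reasonable substitute for larger inputs.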

The compositions of the present application may be used to enhance the
therapeutic effectiveness of RNAi therapeutics. Exemplary RNAi therapeutics
include double-stranded ribonucleic acids (dsRNAs) for inhibiting the expression of
a K-ras oncogene in a cell for treating pancreatic cancer, described in
US20040121348; double-stranded ribonucleic acids (dsRNAs) having nucleotide
sequences substantially identical to at least a part of a 3'-untranslated region
(3'-UTR) of a (+) strand RNA virus, useful for treating hepatitis C infection, described
in US20040091457; and siRNAs that down-regulate expression of neurite growth
inhibitor receptor, prostaglandin D2 receptor, IkappaB kinase or protein kinase PKR
genes, useful for treating cancer and inflammatory disease, described in U.S. Patent
Application Publication No. 20030191077.

Furthermore, the crystal structure, the electronic representation, as well as
other aspects of the application also relate to a method for identifying, designing,
and/or optimizing an RNAi construct or RNAi therapeutic of the application. For
example, based on the structure of the PAZ domain, particularly the site that may
interact with the 3' end of a nucleic acid (e.g., an RNA or a portion of an RNAi
construct), the nucleic acid sequence or structure may be designed and/or optimized
to increase or decrease the nucleic acid's interaction with the PAZ domain.
Similarly, based on the PIWI domain as well as the interface between the PIWI
domain and the PAZ domain, an RNAi construct or RNAi therapeutic may be
designed and/or optimized. An optimized RNAi therapeutic may have an improved
pharmacokinetic and/or pharmacodynamic profile.

All references cited herein including the numbered references above and
others throughout the application are incorporated by reference in their entirety.

EQUIVALENTS 

While this invention has been particularly shown above and in the following
examples and described with reference to preferred embodiments thereof, it will be
understood by those skilled in the art that various changes in form and details may
be made therein without departing from the scope of the invention encompassed by
the appended claims.

EXEMPLIFICATION 

Example 1. DNA constructs and site-directed mutagenesis 

cDNAs encoding full length human Ago1, Ago2, and Ago3 were generated
by RT-PCR from RNAs extracted from 293T, HeLa or S2 cells. Plasmids
expressing various Argonaute proteins were made by cloning the cDNAs into a
pcDNA3-based myc-epitope tagging vector. Mutations were introduced by
site-directed mutagenesis using the QuikChange Kit (Stratagene).

Example 2. Human Cell Culture and transfection

Human 293T cells were cultured in DMEM (10% FBS) in a 37 °C incubator
with 5% CO2. Cell transfections were carried out using calcium-phosphate buffer or
Mirus TransIT-LT1 transfection reagent. Luciferase GL3 siRNA duplex was
purchased from Dharmacon. siRNA transfection was carried out using
Oligofectamine (Invitrogen). Procedures for immunoprecipitation and
immunoblotting were described previously (Caudy et al., Genes Dev. 16, 2491
(2002)). Lysis buffer contained 0.5% NP-40, 150 mM NaCl, 2 mM MgCl2, 2 mM
CaCl2 and 20 mM Tris-HCl pH 7.5. Protease inhibitor and DTT (final 2 mM) were
added immediately before lysis. The antibody to the myc tag (9E10) was purchased
from Neomarker. RNAs associated with the Ago immunocomplexes were isolated
using phenol-chloroform/chloroform extraction and ethanol precipitation. RNAs
were stained using SYBR Gold from Molecular Probes. Small RNA Northern
blotting was carried out as described previously (Caudy et al., supra).

Example 3. mRNA Cleavage assays and in vitro reconstitution of RISC activity 

Capped and uniformly radiolabeled Luciferase mRNA target was in vitro
transcribed using the Riboprobe system from Promega and was purified using
PAGE as described previously. The immunoaffinity purified Ago complexes were
first resuspended in 10 µl buffer containing 100 mM KCl, 2 mM MgCl2 and 10 mM
Tris pH 7.5. For in vitro reconstitution of RISC activity, 4 µl of 1 µM in vitro
phosphorylated (except where noted) single-stranded siRNA, duplexed siRNA or
single-stranded DNA were added to the mix and incubated at 30 °C for 30 minutes.
The final reaction was carried out in 20 µl which also contained 1 mM ATP, 0.2 mM
GTP, 8 units of RNasin, 0.3 µg creatine phosphokinase and 25 mM creatine
phosphate. No-ATP reactions lacked ATP, GTP and the regeneration system. After
a 2 hour incubation at 30 °C, RNAs were extracted using Trizol and chloroform and
precipitated with isopropyl alcohol.

Example 4. Gene targeting and mice 

The targeting construct was obtained by screening the lambda phage 3' HPRT
library described in Zheng et al., Nucleic Acids Res. 27, 2354 (1999). The
resultant targeting construct, containing exons 3-6 of mAgo2, was electroporated
into mouse embryonic stem (ES) cells. Targeted clones were injected into C57BL/6
blastocysts to generate chimeras, which were crossed with C57BL/6 mice. Mouse
genotyping was performed by Southern blot after digestion of genomic DNA with
HindIII. The probe was amplified from genomic DNA using primer sequences
5'-GACAATAGTGCAGAGACTTGC-3' and 5'-GGGCAGCCTGAGAATTGA-3'.
The GenBank Accession Number for mouse Ago2 is AB081472. The Ago2 gene trap
cell line RRE192 was obtained from Bay Genomics (Stryke et al., Nucleic Acids
Res. 31, 278 (2003)).

Example 5. In situ hybridization

In situ hybridization was performed on whole-mount embryos essentially as
described (Belo et al., Mech. Dev. 68, 45 (1997)). Riboprobes for in situ
hybridization were synthesized from T7-promoter containing PCR products
corresponding to the 3' UTRs of Ago2 or Ago3. The Ago2 probe was amplified
from genomic DNA using the primers 5'-AGCTGTGAAGGCTCTGAG-3' and
5'-CAGTCCTACAGGACAAATCT-3', and the Ago3 probe was similarly
constructed using the primers AGGCTGTACAGATTCACCAAGATA and
CCTTTACAAGAATAGATGCACATT.

Example 6. MEF Culture, transfection, and gene silencing assays

Day 10.5 embryos were dissected and diced in trypsin. Mouse embryo
fibroblasts (MEFs) were cultured in DMEM + 10% FBS. MEFs were transfected in
24 well plates using Lipofectamine reagent according to the manufacturer's
recommendations. Where indicated, each well received 2.5 picomoles of siRNA
and 1 µg of plasmid DNA. Dual luciferase assays (Promega) were carried out by
cotransfecting cells with plasmids containing firefly luciferase under the control of
the SV40 promoter (pGL3-Control, Promega) and Renilla luciferase under the
control of the SV40 early enhancer/promoter region (pSV40, Promega). Luciferase
siRNA was obtained from Dharmacon (siStarter, anti-luc siRNA-1). GFP (pEGFP-C1)
and dsRed (pDsRed-Express-N1) plasmids were obtained from Clontech. EGFP
siRNA was obtained from Dharmacon (EGFP duplex). Ago1 and Ago2 expression
plasmids were as described for the IP experiments, except that proteins were fused
to an HA tag rather than a myc tag. Constructs for the translational repression assay
were kindly provided by P. Sharp (Doench et al., Genes Dev. 17, 438 (2003)).
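
For illustration, readings from a dual luciferase assay of the kind described above are commonly reduced to a normalized ratio. The sketch below assumes that firefly luciferase is the siRNA-targeted reporter and Renilla luciferase is the co-transfected normalization control; the function names and the example readings are hypothetical and are not taken from the application.

```python
def normalized_ratio(firefly: float, renilla: float) -> float:
    """Firefly signal normalized to the co-transfected Renilla control."""
    if renilla <= 0:
        raise ValueError("Renilla signal must be positive")
    return firefly / renilla


def relative_expression(sirna_well, control_well) -> float:
    """Normalized ratio of an siRNA-treated well relative to a control well.

    Each well is a (firefly, renilla) pair; 1.0 means no repression.
    """
    return normalized_ratio(*sirna_well) / normalized_ratio(*control_well)


if __name__ == "__main__":
    # Hypothetical luminometer readings (relative light units).
    treated = (2000.0, 10000.0)
    untreated = (20000.0, 10000.0)
    print(f"relative expression: {relative_expression(treated, untreated):.2f}")
```

Normalizing to the second reporter corrects for well-to-well differences in transfection efficiency and cell number before treated and control wells are compared.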

Example 7. RT-PCRs

RNA was extracted from cells and embryos using Trizol Reagent. Reverse
transcription was conducted using Superscript-II RT from Invitrogen according to
the manufacturer's instructions. Subsequent PCR reactions were carried out using the
following primers (5'-3'):
mAgo1, GCATTTCAAGCAGAAATATAACCTTCA and AGACTTTGATCTCAATCCCATTGTAG;
mAgo2, GTACTTCAAGGACAGGCACAAGCTG and TGGCAATTGCTTTGTTCCTGC;
mAgo3, GCTGCAGCTGAAGTACCCACA and GTACTGGAGCATAGGTGCTGGAAGTA;
mouse β-actin, CACTATTGGCAACGAGCGGT and CTTCATGGTGCTAGGAGCCA.
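
As a convenience sketch, the primer pairs listed above (with sequences joined across the line breaks of the printed text) can be held in a small table and checked for length and GC content before ordering or reuse. The dictionary layout and helper names are assumptions of this sketch; only the sequences come from the text.

```python
# Primer pairs from the text above, written 5'-3' (forward, reverse).
PRIMERS = {
    "mAgo1": ("GCATTTCAAGCAGAAATATAACCTTCA", "AGACTTTGATCTCAATCCCATTGTAG"),
    "mAgo2": ("GTACTTCAAGGACAGGCACAAGCTG", "TGGCAATTGCTTTGTTCCTGC"),
    "mAgo3": ("GCTGCAGCTGAAGTACCCACA", "GTACTGGAGCATAGGTGCTGGAAGTA"),
    "beta-actin": ("CACTATTGGCAACGAGCGGT", "CTTCATGGTGCTAGGAGCCA"),
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence containing only A, C, G and T."""
    return seq.upper().translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    for target, pair in PRIMERS.items():
        for primer in pair:
            print(f"{target}\t{primer}\t{len(primer)} nt\tGC {gc_fraction(primer):.0%}")
```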

Example 8. miRNA microarrays

RNA was recovered from immunoprecipitates with Trizol (Invitrogen) and
conjugated with a Cy3 dinucleotide using T4 RNA ligase (NEB). Labeled RNA was
hybridized to microarrays containing probes to 152 human mature microRNA
sequences, washed, and scanned on a GenePix 4000B array scanner. Log-ratios of
Cy3/Cy5 values were global median center normalized for the Ago-1, Ago-2 and Ago-3
immunoprecipitates. For the control immunoprecipitate, data were normalized by a
constant that was the average of the normalization constants for the Ago-1, Ago-2
and Ago-3 datasets. Data were sorted in descending order for the Ago-2 dataset and a
heat map was generated using Treeview (Stanford University).
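
The normalization steps described above may be sketched as follows. This is an illustrative reading of the text, not the analysis code actually used: the choice of log base 2, the function names and the example intensities are assumptions.

```python
import math
from statistics import median


def log_ratios(cy3, cy5):
    """Per-probe log2(Cy3/Cy5); the log base is an assumption (not stated in the text)."""
    return [math.log2(a / b) for a, b in zip(cy3, cy5)]


def median_center(values):
    """Subtract the dataset median; return (centered values, normalization constant)."""
    constant = median(values)
    return [v - constant for v in values], constant


def normalize_control(control_values, ip_constants):
    """Shift control log-ratios by the average of the constants used for the Ago datasets."""
    shift = sum(ip_constants) / len(ip_constants)
    return [v - shift for v in control_values]


def descending_order(reference_values):
    """Probe indices sorted by descending value in the reference (Ago-2) dataset."""
    return sorted(range(len(reference_values)),
                  key=lambda k: reference_values[k], reverse=True)


if __name__ == "__main__":
    # Hypothetical intensities for three probes; not data from the application.
    cy5 = [100.0, 100.0, 100.0]
    raw = {
        "Ago-1": log_ratios([400.0, 100.0, 50.0], cy5),
        "Ago-2": log_ratios([200.0, 800.0, 100.0], cy5),
        "Ago-3": log_ratios([100.0, 200.0, 400.0], cy5),
    }
    centered, constants = {}, []
    for name, values in raw.items():
        centered[name], constant = median_center(values)
        constants.append(constant)
    centered["control"] = normalize_control(log_ratios([100.0, 50.0, 200.0], cy5), constants)
    order = descending_order(centered["Ago-2"])
    for name, values in centered.items():
        print(name, [round(values[k], 2) for k in order])
```

Rows reordered in this way can be passed to any heat-map viewer; the ordering is driven only by the Ago-2 dataset, as in the text.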

Example 9. miRNA Microarray Results. 

Ago1-, Ago2- and Ago3-associated RNAs were hybridized to microarrays
that report the expression status of 152 human microRNAs. Patterns of associated
RNAs were identical within experimental error in each case (Fig. 9, Panel A).
Additionally, each of the tagged Ago proteins associated similarly with a
co-transfected siRNA (Fig. 9, Panel C). Previous studies have used tagged siRNAs to
affinity purify Argonaute-containing RISC (Martinez et al., supra). These
preparations, containing mixtures of at least two mammalian Argonautes, were
capable of cleaving synthetic mRNAs that were complementary to the tagged
siRNA. The ability of purified complexes containing individual Argonaute proteins
to catalyze similar cleavages was examined. Surprisingly, irrespective of the siRNA
sequence, only Ago2-containing RISC was able to catalyze cleavage (Fig. 9, Panel
B; Fig. 14). All three Ago proteins were similarly expressed and bound similar
amounts of transfected siRNA (Fig. 1 Panels C and D).

These results demonstrated that mammalian Argonaute complexes are
biochemically distinct, with only a single family member being competent for
mRNA cleavage. To examine the possibility that Ago proteins might also be
biologically specialized, the mouse Ago2 gene was disrupted by targeted insertional
mutagenesis (Fig. 15; Fig. 10, Panel A) (Zheng et al., supra). Intercrosses of Ago2
heterozygotes produced only wild-type and heterozygous offspring, strongly
suggesting that disruption of Ago2 produced an embryonic-lethal phenotype. Ago2
deficient mice display several developmental abnormalities beginning approximately
halfway through gestation. Both gene-trap and in situ hybridization data of day 9.5
embryos show broad expression of Ago2 in the embryo, with some hotspots of
expression in the forebrain, heart, limb buds and branchial arches (Fig. 10, Panels F
and G). The most prominent phenotype is a defect in neural tube closure (Fig. 10,
Panels D and E), often accompanied by apparent mispatterning of anterior structures
including the forebrain (Fig. 10, Panels C and D). Roughly half of the embryos
display complete failure of neural tube closure in the head region (Fig. 10, Panel E),
while all embryos display a wavy neural tube in more caudal regions. Mutant
embryos also suffer from apparent cardiac failure. The hearts are enlarged, and
often accompanied by pronounced swelling of the pericardial cavity (Fig. 10, Panel
C). By day 10.5, mutant embryos are severely developmentally delayed compared
to wildtype and heterozygous littermates (Fig. 10, Panel B). This large difference in
size, like the apparent cardiac failure, may be accounted for by a general nutritional
deficiency caused by yolk sac and placental defects (Conway et al., Genesis 35, 1
(2003)), as histological analysis reveals abnormalities in these tissues.
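
The inference of embryonic lethality from the heterozygous intercross described above rests on Mendelian expectation: if homozygous mutants were viable, about one quarter of the offspring should carry two disrupted alleles. The following sketch computes the expected genotype counts and the probability of recovering no homozygous mutants by chance. The litter sizes are hypothetical, since the number of offspring genotyped is not stated here, and the function names are assumptions of the sketch.

```python
def expected_counts(n_offspring: int):
    """Expected wild-type, heterozygous and homozygous-mutant counts (1:2:1 ratio)."""
    return (n_offspring / 4, n_offspring / 2, n_offspring / 4)


def prob_no_homozygous_mutants(n_offspring: int) -> float:
    """Chance that none of n offspring is homozygous mutant if the genotype were viable."""
    return 0.75 ** n_offspring


if __name__ == "__main__":
    # Hypothetical numbers of genotyped offspring, for illustration only.
    for n in (10, 20, 50):
        wt, het, mut = expected_counts(n)
        print(f"n={n}: expect {mut:.1f} homozygous mutants; "
              f"P(none observed) = {prob_no_homozygous_mutants(n):.2e}")
```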

Not all Argonaute proteins are required for successful mammalian
development (Deng et al., Dev. Cell 2, 819 (2002); Kuramochi-Miyagawa et al.,
Development 131, 839 (2004)). Ago subfamily members are expressed in
overlapping patterns in humans (Sasaki et al., Genomics 82, 323 (2003)). In situ
hybridization demonstrates overlapping expression patterns for Ago2 and Ago3 in
mouse embryos (Fig. 10, Panel F; Fig. 16). Considered together with the essentially
identical patterns of miRNA binding, the results suggest the possibility that the
ability of Ago2 to assemble into catalytically active complexes might be critical for
mouse development. Although most miRNAs regulate gene expression at the level
of protein synthesis, recently miR196 has been demonstrated to cleave the mRNA
encoding HoxB8, a developmental regulator (Yekta et al., Science 304, 594 (2004)).
Evolutionary conservation of an essential cleavage-competent RISC in organisms in
which miRNAs predominantly act by translational regulation raises the possibility
that target cleavage by mammalian miRNAs might be more important and
widespread than previously appreciated.

Numerous studies have indicated that experimentally triggered RNAi in
mammalian cells proceeds through siRNA-directed mRNA cleavage since in many,
but not all, cases reiterated binding sites are necessary for repression at the level of
protein synthesis (see, for example, Bartel, Cell 116, 281 (2004); Doench et al.,
supra; Kiriakidou et al., Genes Dev. 18, 1165 (2004)). If Ago2 were uniquely
capable of assembling into cleavage competent complexes in mice, then embryos or
cells lacking Ago2 might be resistant to experimental RNAi. To address this
question, mouse embryo fibroblasts (MEF) were prepared from E10.5 embryos from
Ago2 heterozygous intercrosses. RT-PCR analysis and genotyping revealed that
wild-type, mutant and heterozygous MEF populations were obtained. Importantly,
MEF also express other Ago proteins, including Ago1 and Ago3 (Fig. 11, Panel A).
Ago2 null MEF were unable to repress gene expression in response to an siRNA
(Fig. 11, Panel B; Fig. 17). This defect could be rescued by addition of a third
plasmid that encoded human Ago2 but not by Ago1 (Fig. 11, Panel B). In contrast,
responses were intact for a reporter of repression at the level of protein synthesis,
mediated by an siRNA binding to multiple mismatched sites (Doench et al., supra)
(Fig. 11, Panel C).

Example 10. Mapping of determinants for cleavage 

Since Ago2 was unique in its ability to form cleavage-competent complexes,
determinants of this capacity were mapped. Deletion analysis indicated that an
intact Ago2 was required for RISC activity (Fig. 18). Therefore, the sequence of
highly conserved but cleavage-incompetent Ago proteins was used as a guide to the
construction of Ago2 mutants. A series of point mutations included H634P, H634A,
Q633R, Q633A, H682Y, L140W, F704Y and T744Y. While all of these mutations
retain siRNA binding activity and most retain cleavage activity, changes at Q633
and H634 have a profound effect on target cleavage (Fig. 12). Both the Q633R and
H634P mutations, in which residues were changed to corresponding residues in
Ago1/3, abolished catalysis. Changing H634 to A also inactivated Ago2, while a
similar change, Q633A, was permissive for cleavage. Thus, even relatively
conservative changes can negate the ability of Ago2 to form cleavage-competent
RISC.
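
The point mutations above follow the usual notation of wild-type residue, position and replacement residue (e.g., H634P). A minimal sketch of parsing that notation and applying it to a protein sequence is given below. The helper names are assumptions, and the five-residue sequence is a toy stand-in rather than the Ago2 sequence, which is not reproduced here.

```python
import re

MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(label: str):
    """Split a label such as 'H634P' into (wild-type residue, position, new residue)."""
    match = MUTATION_RE.match(label)
    if match is None:
        raise ValueError(f"unrecognized mutation label: {label!r}")
    wild_type, position, replacement = match.groups()
    return wild_type, int(position), replacement


def apply_mutation(sequence: str, label: str) -> str:
    """Return the sequence with the substitution applied (positions are 1-based)."""
    wild_type, position, replacement = parse_mutation(label)
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[position - 1] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, found {sequence[position - 1]}"
        )
    return sequence[:position - 1] + replacement + sequence[position:]


if __name__ == "__main__":
    print(parse_mutation("H634P"))
    # Toy sequence, hypothetical: Q at position 3 and H at position 4.
    print(apply_mutation("MKQHL", "Q3R"))
```

Checking the wild-type residue before substitution guards against off-by-one numbering errors when a mutation list is applied to a different isoform or construct.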

Several possibilities could explain a lack of cleavage activity for Ago2
mutants. Such mutations could interfere with the proper folding of Ago2. However,
this seems unlikely as those same residues presumably permit proper folding in
closely related Argonaute proteins, and mutant Ago2 proteins retained the ability to
interact with siRNAs. Alternatively, cleavage-incompetent Ago2 mutants could lose
the ability to interact with the putative Slicer. Finally, Ago2 itself might be Slicer,
with the conservative substitutions altering the active center of the enzyme in a way
that prevents cleavage. The last possibility predicted that an active enzyme could be
reconstituted with relatively pure Ago2 protein. Ago2 was therefore immunoaffinity
purified from 293T cells, and reconstitution of RISC in vitro was attempted. Incubation
with the double-stranded siRNA produced no significant activity, whereas Ago2
could be successfully programmed with single-stranded siRNAs to cleave a
complementary substrate (Fig. 13, Panel A). Formation of the active enzyme was
unaffected by first washing the immunoprecipitates with up to 2.5 M NaCl or 1 M
urea. A 21 nt single-stranded DNA was unable to direct cleavage (Fig. 13, Panel A).
Programming could be accomplished with different siRNAs that direct activity
against different substrates (Fig. 19). RISC is formed through a concerted assembly
process in which the RISC-Loading Complex (RLC) acts in an ATP-dependent
manner to place one strand of the small RNA into RISC (Nykanen et al., Cell 107,
309 (2001); Pham et al., Cell 117, 83 (2004); Tomari et al., Cell 116, 831 (2004)).
In vitro reconstitution occurs in the absence of ATP, suggesting that Ago2 could be
programmed with siRNAs without a need for the normal assembly process (Fig. 13,
Panel A). However, in vitro reconstitution of RISC still required the essential
characteristics of an siRNA. For example, single-stranded siRNAs that lack a 5'
phosphate group cannot reconstitute an active enzyme.

While consistent with the possibility that the catalytic activity of RISC is
carried within Ago2, these results do not rule out the possibility that a putative Slicer
co-purifies with Ago2. To demonstrate more conclusively that Ago2 is Slicer, the
crystal structure of an Argonaute protein from an archaebacterium, Pyrococcus
furiosus, was analyzed. This structure revealed that the PIWI domain folds into a
structure analogous to the catalytic domain of RNAseH and ASV integrase. The
notion that such a domain would lie at the center of RISC cleavage is consistent with
previous observations. RNAseH and integrases cleave their substrates leaving 5'
phosphate and 3' hydroxyl groups through a metal catalyzed cleavage reaction
(Chapados et al., J. Mol. Biol. 307, 541 (2001); Yang et al., Structure 3, 131 (1995)).
Notably, previous studies have strongly indicated that the scissile phosphate in the
targeted mRNA is cleaved via a metal ion in RISC to give the same phosphate
polarity (Schwarz et al., Curr. Biol. 14, 787 (2004)). The in vitro data are consistent
with the reconstituted RISC also requiring a divalent metal (Fig. 20). The active
center of RNAseH and its relatives consists of a catalytic triad of three carboxylate
groups contributed by aspartic or glutamic acid (Chapados et al., supra; Yang et al.,
supra). These coordinate the essential metal and activate water molecules for
nucleolytic attack. Reference to the known structure of RNAseH reveals two
aspartate residues in the archaeal Ago protein present at the precise spatial locations
predicted for formation of an RNAseH-like active site. These align with identical
residues in the human Ago2 protein (Fig. 21). Therefore, to test whether the PIWI
domain of Ago2 provides catalytic activity to RISC, the two conserved aspartates,
D597 and D669, were changed to alanine, with the prediction that either mutation
would inactivate RISC cleavage. Consistent with this hypothesis, the mutant Ago2
proteins were incapable of assembling into a cleavage-competent RISC in vitro or in
vivo, despite retaining the ability to bind siRNAs (Fig. 13, Panels B-D).

Considered together, the data provide strong support for the notion that
Argonaute proteins are the catalytic components of RISC. Firstly, the ability to form
an active enzyme is restricted to a single mammalian family member, Ago2. This
conclusion is supported both by biochemical analysis and by genetic studies in
mutant MEF. Secondly, single amino acid substitutions within Ago2 that convert
residues to those present in closely related proteins negate RISC cleavage. Thirdly,
the structure of the P. furiosus Argonaute protein reveals provocative structural
similarities between the PIWI domain and RNAseH domains, providing a
hypothesis for the method by which Argonaute cleaves its substrates. This
hypothesis was tested by introducing mutations in the predicted Ago2 active site.

Example 11. Protein expression and purification 

The full length Argonaute gene from Pyrococcus furiosus (PfAgo) was
cloned into a pSMT3 vector. PfAgo was expressed as an Smt3 fusion with an
N-terminal histidine tag in BL21-RIPL cells. Smt3-Argonaute protein was purified
with an NTA-agarose affinity column, and Smt3 was removed using Ulp1 protease,
which cuts right after Smt3. The pSMT3 vector-Ulp1 protease system was a
generous gift from Dr. Chris Lima. PfAgo was further purified by a heating step
(as this protein is from a hyperthermophilic organism), anion exchange
chromatography and gel filtration. Purified protein was concentrated to 12.5 mg/ml
in 50 mM Tris-HCl (pH 8.0) and 300 mM NaCl. Se-Met substituted protein was
expressed using metabolic inhibition of methionine biosynthesis as described in
G.D. Van Duyne, R.F. Standaert, P.A. Karplus, S.L. Schreiber, J. Clardy, J. Mol.
Biol. 229, 105-24 (1993). Se-Met incorporation was confirmed by mass
spectrometry.

Example 12. Crystallization and data collection 

Initial crystals were grown by vapor diffusion using the hanging-drop
method in the presence of organic solvents. The quality of crystals was significantly
improved by several rounds of microseeding. Selenomethionine (Se-Met)
substituted protein crystals were obtained by microseeding with native crystals.
Mercury-derivatized crystals were prepared by soaking native crystals in 1 mM
p-chloromercuriphenylsulfonic acid for 5 hours. For cryoprotection, crystals were
soaked for 1 min in crystallization solution containing increasing amounts of
ethylene glycol (EG) in 5% steps to a final EG concentration of 40% (v/v). Crystals
diffracted to approximately 2 Å resolution. All data were collected to a resolution of
2.25 Å under cryogenic conditions (100 K) at beamline X25 at the National
Synchrotron Light Source (NSLS) at Brookhaven National Laboratory. Data were
processed with HKL2000 (http://www.hkl-xray.com) (Table 1 provided in Figure
23).

Crystallization condition for native crystal:

1) Well solution: water; and 2) Drop: 2 µl of 12.5 mg/ml PfAgo protein mixed
with 1 µl of water and 0.3 µl of 7% 1-butanol.

Crystallization condition for Se-crystal:

1) Well solution: water; and 2) Drop: 2 µl of 12.5 mg/ml PfAgo protein mixed
with 0.3 µl of 7% 1-butanol.
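
For orientation, the starting composition of each drop follows from the volumes given above. The sketch below assumes that volumes are additive and that the percentage of 1-butanol is carried through unchanged as a simple dilution; it describes the drop at mixing, before any vapor equilibration against the well. The function name and data layout are hypothetical.

```python
def drop_composition(components):
    """Return (total volume, {solute: concentration}) for a mixed drop.

    `components` is a list of (volume_ul, {solute: stock_concentration}) pairs;
    concentrations keep whatever unit each stock uses (mg/ml, percent, ...).
    """
    total = sum(volume for volume, _ in components)
    amounts = {}
    for volume, solutes in components:
        for name, concentration in solutes.items():
            amounts[name] = amounts.get(name, 0.0) + volume * concentration
    return total, {name: amount / total for name, amount in amounts.items()}


if __name__ == "__main__":
    native = [(2.0, {"PfAgo mg/ml": 12.5}), (1.0, {}), (0.3, {"1-butanol %": 7.0})]
    se_met = [(2.0, {"PfAgo mg/ml": 12.5}), (0.3, {"1-butanol %": 7.0})]
    for label, drop in (("native", native), ("Se-Met", se_met)):
        total, conc = drop_composition(drop)
        summary = ", ".join(f"{k} = {v:.2f}" for k, v in conc.items())
        print(f"{label}: {total:.1f} ul; {summary}")
```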

Example 13. Structure determination 

Phases were calculated from a three-wavelength anomalous dispersion
(MAD) experiment at the selenium inflection, peak and high remote energies using a
Se-Met substituted crystal, together with data collected at the peak energy for the
mercury derivative. Seventeen
selenium sites were located using SnB (C.M. Weeks, R. Miller, J. of Applied
Crystallography 32, 120-124 (1999)) and a single Hg site was located by calculating
an anomalous difference Fourier map using initial phases calculated from the
selenium data. Data from all three wavelengths for the Se-Met derivative and one
wavelength for the Hg derivative were used for heavy atom site refinement by the
program SHARP (E. de La Fortelle, G. Bricogne, Meth. Enzymol. 276, 472-494
(1997)), followed by solvent flattening. A partial model was built using the program
wARP (A. Perrakis, R. Morris, V.S. Lamzin, Nature Struct. Biol. 6, 458-463
(1999)). The program SIGMAA (CCP4, Acta Crystallogr. D50, 760 (1994))
was used to combine the partial structure model with the
experimental phases. Iterative model building using the program O (T.A. Jones, M.
Kjeldgaard, Methods Enzymol. 277, 173-208 (1997)) and crystallographic
refinement with the program CNS (A.T. Brunger et al., Acta Crystallogr. D54, 905-
921 (1998)) led to the final model, which contains 5913 protein atoms and 77 water
molecules (Table 1 provided in Figure 23). Several loops are disordered in the
structure and were not included: L26-G38, I253-K256, E278-V281, L347-L354, and
S414-K442.
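
As a bookkeeping sketch, the disordered segments listed above can be parsed to count the residues absent from the model. The second range is read here as I253-K256, which is an assumption about the printed text; the helper name is likewise hypothetical.

```python
import re

# Disordered loops as listed above (second entry read as I253-K256).
LOOPS = ["L26-G38", "I253-K256", "E278-V281", "L347-L354", "S414-K442"]

RANGE_RE = re.compile(r"[A-Z](\d+)-[A-Z](\d+)")


def loop_length(label: str) -> int:
    """Number of residues in an inclusive range such as 'L26-G38'."""
    match = RANGE_RE.fullmatch(label)
    if match is None:
        raise ValueError(f"unrecognized residue range: {label!r}")
    start, end = int(match.group(1)), int(match.group(2))
    if end < start:
        raise ValueError(f"range runs backwards: {label!r}")
    return end - start + 1


if __name__ == "__main__":
    for loop in LOOPS:
        print(loop, loop_length(loop))
    print("residues not modeled:", sum(loop_length(loop) for loop in LOOPS))
```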

Example 14. UV cross linking 

PfAgo or GST were incubated with a 21-mer 5'-32P-labeled ssRNA with an
IodoU at the 5' end and unlabeled competitor ssRNA for 30 min at 30 °C.
Incubation was carried out in 10 mM Tris-HCl (pH 7.5), 2 mM MgCl2, and 150 mM
KCl. UV crosslinking was done using a Stratalinker (Stratagene) at 312 nm for 20
min at room temperature. Double-stranded RNA probes were gel purified after
annealing the 5'-32P-labeled ssRNA with an unlabeled complementary strand to
form a ds-siRNA (including a 2-nucleotide 3' overhang and a 5'-phosphate group).

CLAIMS

1. A crystalline Argonaute.

2. A method of determining the three-dimensional structure of an Argonaute
protein or a mutant, derivative, variant, analogue, homologue, sub-domain or
fragment thereof comprising:

(a) aligning the amino acid sequence of the Argonaute mutant,
derivative, variant, analogue, homologue, sub-domain or fragment with the amino
acid sequence set forth in SEQ ID NO: 5 to match homologous regions of the amino
acid sequences;

(b) modelling the structure of the matched homologous regions of said
target Argonaute protein of unknown structure on the corresponding regions of the
Argonaute protein structure as defined by the atomic co-ordinates as set forth in
Table 3; and

(c) determining a conformation for the Argonaute mutant, derivative,
variant, analogue, homologue, sub-domain or fragment which substantially
preserves the structure of said matched homologous regions.

3. A method of identifying an agent that binds an Argonaute protein
comprising:

(a) applying a 3-dimensional molecular modeling algorithm to the
atomic coordinates of an Argonaute protein shown in Table 3 to determine the
spatial coordinates of the binding pocket of the Argonaute protein; and

(b) electronically screening the stored spatial coordinates of a set of
candidate agents against the spatial coordinates of the Argonaute protein binding
pocket to identify agents that can bind to the Argonaute protein.

4. A computer-based method for the analysis of the interaction of a
molecular structure with an Argonaute protein, comprising:

(a) providing a structure comprising a three-dimensional representation
of said Argonaute protein or a portion thereof, which representation comprises all or
a portion of the coordinates set forth in Table 3;

(b) providing a molecular structure to be fitted to said Argonaute protein
structure; and

(c) fitting the molecular structure to the Argonaute protein structure of
(a).

5. A computer-readable storage medium encoded with the atomic
coordinates of an Argonaute protein as shown in Table 3.

6. A data array comprising the atomic coordinates of an Argonaute protein as
set forth in Table 3.

7. An electronic representation of a crystal structure of an Argonaute protein. 

8. An electronic representation of a binding site of the Argonaute protein. 

9. An electronic representation of a domain of the Argonaute protein. 

10. An electronic representation of an agent in a binding site of an Argonaute
protein.

11. A method for obtaining a crystal of an Argonaute protein, comprising
subjecting an Argonaute protein at 10-15 mg/ml to crystallization conditions for a
time sufficient for crystal formation.

12. A method of identifying an agent that modulates the activity of an RNAi
construct, comprising identifying an agent that modulates the expression and/or
activity of an Argonaute protein.

13. A method of identifying an agent that potentiates the activity of an RNAi
construct, comprising identifying an agent that increases the expression and/or
activity of an Argonaute protein.
14. A method of identifying an agent that modulates the activity of an RNAi
construct comprising:

(a) providing an isolated or recombinant Argonaute protein; and

(b) assaying the activity of said Argonaute protein in the presence of
a candidate agent,

wherein a change in the activity of said Argonaute protein in the
presence of a candidate agent is indicative of said candidate agent capable of
modulating the activity of an RNAi construct.

15. A composition for targeted gene inhibition comprising an agent that
modulates the RNase activity of an Argonaute protein.

16. A pharmaceutical composition comprising the composition of claim 15 
and a physiologically acceptable carrier. 

17. A cell line that overexpresses an Argonaute protein. 

18. An assay for identifying nucleic acid sequences for conferring a
particular phenotype in a cell, comprising:

(a) constructing a library of nucleic acid sequences oriented to
produce double stranded RNA;

(b) introducing a dsRNA library into a culture of target cell line of
claim 17;

(c) identifying members of the library which confer a particular
phenotype on the cell, and identifying the sequence from the cell which is identical
or homologous to the library member.

19. A nucleic acid composition comprising:

(a) a first nucleic acid comprising an RNAi construct and

(b) a second nucleic acid encoding an Argonaute protein.
20. The nucleic acid composition of claim 19, wherein the RNAi construct
comprises a nucleotide sequence encoding a single-strand siRNA.

21. A pharmaceutical composition comprising the nucleic acid composition
of claim 19 and a physiologically acceptable carrier.

22. A cell expressing the nucleic acid composition of claim 19.
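
Claims 5 and 6 recite atomic coordinates held on a storage medium or in a data array. Purely as an illustration, and not as the applicants' implementation, the following Python sketch shows one way coordinates in the fixed-column ATOM record layout of the coordinate listing might be written and read back. The names `Atom`, `format_atom_record`, `parse_atom_record` and `distance` are hypothetical. The sample values are the N and CA atoms of GLU A 56 as they appear in the listing; the serial numbers 363 and 364 are assumed from the running atom count.

```python
import math
from dataclasses import dataclass


@dataclass
class Atom:
    serial: int
    name: str
    res_name: str
    chain: str
    res_seq: int
    x: float
    y: float
    z: float
    occupancy: float
    b_factor: float


def format_atom_record(atom: Atom) -> str:
    """Write an Atom as a fixed-column ATOM line (atom names of up to 3 characters)."""
    return (
        f"ATOM  {atom.serial:5d}  {atom.name:<3s} {atom.res_name:>3s} "
        f"{atom.chain}{atom.res_seq:4d}    "
        f"{atom.x:8.3f}{atom.y:8.3f}{atom.z:8.3f}"
        f"{atom.occupancy:6.2f}{atom.b_factor:6.2f}"
    )


def parse_atom_record(line: str) -> Atom:
    """Read one ATOM line by column position rather than by whitespace.

    Column slicing matters because occupancy and B-factor fuse into a
    single token (e.g. "1.00113.58") whenever the B-factor reaches 100.
    """
    return Atom(
        serial=int(line[6:11]),
        name=line[12:16].strip(),
        res_name=line[17:20].strip(),
        chain=line[21],
        res_seq=int(line[22:26]),
        x=float(line[30:38]),
        y=float(line[38:46]),
        z=float(line[46:54]),
        occupancy=float(line[54:60]),
        b_factor=float(line[60:66]),
    )


def distance(a: Atom, b: Atom) -> float:
    """Euclidean distance between two atoms, in the units of the coordinates."""
    return math.dist((a.x, a.y, a.z), (b.x, b.y, b.z))


if __name__ == "__main__":
    n = Atom(363, "N", "GLU", "A", 56, -8.225, 33.455, 62.715, 1.00, 113.58)
    ca = Atom(364, "CA", "GLU", "A", 56, -7.348, 34.009, 63.738, 1.00, 113.58)
    for atom in (n, ca):
        line = format_atom_record(atom)
        print(line)
        assert parse_atom_record(line) == atom  # round trip is lossless
    print(f"N-CA distance: {distance(n, ca):.3f}")
```

The N-CA separation computed this way comes out close to 1.46 Å, which is the expected length of that backbone bond and gives a quick sanity check that x, y and z were paired with the right atoms.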
Figure 8
Figure 10
Figure 16
Figure 20
Figure 24



Table 2. Crystallographic Statistics

a=69.72, b=104.19, c=74.01, α=90, β=102.83, γ=90

A. Data Reduction Statistics

            λ (Å)    Resolution (Å)     Measured      Unique          Complete (%)   I/σ(I)
                                        Reflections   Reflections
Se peak     0.9791   2.25 (2.33-2.25)   350855        48108 (4820)    98.7 (100)     42.6 (6.03)
Se edge     0.9796   2.25 (2.33-2.25)   351677        48228 (4852)    98.7 (100)     41.8 (5.50)
Se remote   0.9638   2.25 (2.33-2.25)   354296        48470 (4848)    99.3 (100)     41.6 (4.34)
Hg peak     1.0076   2.25 (2.33-2.25)   354357        48293 (4781)    99.8 (100)     39.4 (4.24)


B. Phasing Statistics

Acentric Phasing Power(b)
Se peak anomalous        2.912   0.616
Se edge isomorphous      0.903   0.294
Se edge anomalous        1.234   0.267
Se remote isomorphous    0.596   0.181
Se remote anomalous      1.389   0.313
Hg peak isomorphous      0.801   0.4C6
Hg peak anomalous        0.301   0.C69

0.690   0.458   0.717
0.212   0.313   0.331

FOM(c)                   50-2.25 Å   2.31-2.25 Å
Acentric reflections     0.553       0.305
Centric reflections      0.305       0.177

C. Refinement Statistics

38.14-2.25 Å
Reflection used
Number of atoms          6113 (protein: 5921, water: 192)
Rwork/#ref               0.228/43294
Rfree/#ref

D. Geometry

Bond length RMSD (Å)
Bond angle RMSD (°)

(a) Rsym/Rano = Σ|I - <I>| / Σ(I), calculated by retaining anomalous-mates and symmetry-mates as independent observations, respectively.
(b) Phasing power calculated as FH(calc)/phase-integrated lack of closure.
(c) FOM is weighted over F amplitude and phase as calculated by the program SHARP.
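
The unit-cell constants quoted at the head of Table 2 (a, b, c in Å; α, β, γ in degrees) fix the matrix that converts orthogonal Å coordinates into fractional coordinates, which is the quantity stored in the SCALE records of a coordinate file. The sketch below is an illustration only and uses the standard crystallographic formula. The function name `fractionalization_matrix` is hypothetical, and the cell lengths are taken to three decimals from the CRYST1 values in the coordinate listing, with β taken from Table 2.

```python
import math


def fractionalization_matrix(a, b, c, alpha, beta, gamma):
    """Return the 3x3 matrix mapping orthogonal (Å) to fractional coordinates.

    Assumed convention: x along a, y in the a-b plane; angles in degrees.
    """
    ca, cb, cg = (math.cos(math.radians(t)) for t in (alpha, beta, gamma))
    sg = math.sin(math.radians(gamma))
    # Unit-cell volume divided by a*b*c.
    v = math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)
    return [
        [1.0 / a, -cg / (a * sg), (ca * cg - cb) / (a * v * sg)],
        [0.0, 1.0 / (b * sg), (cb * cg - ca) / (b * v * sg)],
        [0.0, 0.0, sg / (c * v)],
    ]


if __name__ == "__main__":
    m = fractionalization_matrix(69.726, 104.188, 74.015, 90.0, 102.83, 90.0)
    for row in m:
        print(" ".join(f"{value:10.6f}" for value in row))
```

For this monoclinic cell the three diagonal terms and the single non-zero off-diagonal term agree with the SCALE values given in the listing (0.014342, 0.009598, 0.013857 and 0.003265) to within a few units in the sixth decimal place, the small residual being attributable to rounding of β.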
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REMARK 
REMARK 
REMARK 
REMARK 
REMARK 
REMARK 
REMARK 
REMARK 
REMARK 
REMARK 
REMARK 
REMARK 
REMARK 
REMARK 



REFINEMENT. 
PROGRAM 
AUTHORS 



CNS 1.1 

BRUNGER, ADAMS, CLORE, DELANO, 
GROS, GROSSE-KUNSTLEVE, JIANG, 
KUSZEWSKI, NILGES, PANNU, READ, 
RICE, SIMONSON, WARREN 



DATA USED IN REFINEMENT. 
RESOLUTION RANGE HIGH (ANGSTROMS) 
RESOLUTION RANGE LOW (ANGSTROMS) 
DATA CUTOFF ( SIGMA (F) ) 

DATA CUTOFF HIGH (ABS(F)) 
DATA CUTOFF LOW (ABS (F) ) 

COMPLETENESS (WORKING+TEST) (%) 
NUMBER CF REFLECTIONS 

FIT TO DATA USED IN REFINEMENT. 

CROSS-VALIDATION METHOD : 

FREE R VALUE TEST SET SELECTION : 

R VALUE (WORKING SET) : 

FREE R VALUE : 

FREE R VALUE TEST SET SIZE (%) : 

FREE R VALUE TEST SET COUNT : 

ESTIMATED ERROR OF FREE R VALUE : 

FIT IN THE HIGHEST RESOLUTION BIN. 
TOTAL NUMBER OF BINS USED 
BIN RESOLUTION RANGE HIGH (A) 
BIN RESOLUTION RANGE LOW (A) 
BIN COMPLETENESS (WORKING+TEST) (%) 
REFLECTIONS IN BIN (WORKING SET) 

BIN R VALUE (WORKING SET) 

BIN FREE R VALUE 

BIN FREE R VALUE TEST SET SIZE (%) 
BIN FREE R VALUE TEST SET COUNT 
ESTIMATED ERROR OF BIN FREE R VALUE 



NUMBER OF NON-HYDROGEN ATOMS USED IN REFINEMENT. 
PROTEIN ATOMS 
NUCLEIC ACID ATOMS 
HETEROGEN ATOMS 
SOLVENT ATOMS 



2.25 

38 .17 
0.0 
1695436.78 
0.000000 
99.7 
48822 



THROUGHOUT 
RANDOM 
0 .227 
0.258 
5.2 
2523 
0.005 



2.39 
100.0 

7694 
0.295 
0.350 

5.2 



REMARK 


3 


MEAN B VALUE 


(OVERALL, 


REMARK 


3 


OVERALL ANISOTROPIC B VALUE 


REMARK 


3 


Bll (A**2) 


-7.73 


REMARK 
REMARK   3
REMARK   3  BULK SOLVENT MODELING.
REMARK   3   METHOD USED : FLAT MODEL
REMARK   3   KSOL        : 0.327326
REMARK   3   BSOL        : 55.1711 (A**2)
REMARK   3
REMARK   3  ESTIMATED COORDINATE ERROR.
REMARK   3   ESD FROM LUZZATI PLOT        (A) : 0.32
REMARK   3   ESD FROM SIGMAA              (A) : 0.27
REMARK   3   LOW RESOLUTION CUTOFF        (A) : 5.00
REMARK   3
REMARK   3  CROSS-VALIDATED ESTIMATED COORDINATE ERROR.
REMARK   3   ESD FROM C-V LUZZATI PLOT    (A) :
REMARK   3   ESD FROM C-V SIGMAA          (A) :
REMARK   3
REMARK   3  RMS DEVIATIONS FROM IDEAL VALUES.
REMARK   3   BOND LENGTHS                 (A) : 0.007
REMARK   3   BOND ANGLES            (DEGREES) : 1.2
REMARK   3   DIHEDRAL ANGLES        (DEGREES) : 22.7
REMARK   3   IMPROPER ANGLES        (DEGREES) : 0.76
REMARK   3
REMARK   3  ISOTROPIC THERMAL MODEL : GROUP
REMARK   3
REMARK   3  ISOTROPIC THERMAL FACTOR RESTRAINTS.    RMS    SIGMA
REMARK   3   MAIN-CHAIN BOND              (A**2) : NULL  ; NULL
REMARK   3   MAIN-CHAIN ANGLE             (A**2) : NULL  ; NULL
REMARK   3   SIDE-CHAIN BOND              (A**2) : NULL  ; NULL
REMARK   3   SIDE-CHAIN ANGLE             (A**2) : NULL  ; NULL
REMARK   3
REMARK   3  NCS MODEL : NONE
REMARK   3
REMARK   3  NCS RESTRAINTS.                         RMS    SIGMA/WEIGHT
REMARK   3   GROUP  1  POSITIONAL            (A) : NULL  ; NULL
REMARK   3   GROUP  1  B-FACTOR           (A**2) : NULL  ; NULL
REMARK   3
REMARK   3  PARAMETER FILE : CNS_TOPPAR/protein_rep.param
REMARK   3  PARAMETER FILE : CNS_TOPPAR/water_rep.param
REMARK   3  TOPOLOGY FILE  : CNS_TOPPAR/protein.top
REMARK   3  TOPOLOGY FILE  : CNS_TOPPAR/water.top
REMARK   3
REMARK   3  OTHER REFINEMENT REMARKS: NULL



SEQRES 1 A 713 SER MSE LYS ALA ILE VAL VAL ILE ASN LEU VAL LYS ILE 

SEQRES 2 A 713 ASN LYS LYS ILE ILE PRO ASP LYS ILE TYR VAL TYR ARG 

SEQRES 3 A 713 LEU TYR SER ILE TYR ARG LEU ALA TYR GLU ASN VAL GLY 

SEQRES 4 A 713 ILE VAL ILE ASP PRO GLU ASN LEU ILE ILE ALA THR THR 

SEQRES 5 A 713 LYS GLU LEU GLU TYR GLU GLY GLU PHE ILE PRO GLU GLY 

SEQRES 6 A 713 GLU ILE SER PHE SER GLU LEU ARG ASN ASP TYR GLN SER 

SEQRES 7 A 713 LYS LEU VAL LEU ARG LEU LEU LYS GLU ASN GLY ILE GLY 

SEQRES 8 A 713 GLU TYR GLU LEU SER LYS LEU LEU ARG LYS PHE ARG LYS 

SEQRES 9 A 713 PRO LYS THR PHE GLY ASP TYR LYS VAL ILE PRO SER VAL 

SEQRES 10 A 713 GLU MSE SER VAL ILE LYS HIS ASP GLU ASP PHE TYR LEU 

SEQRES 11 A 713 VAL ILE HIS ILE ILE HIS GLN ILE GLN SER MSE LYS THR 

SEQRES 12 A 713 LEU TRP GLU LEU VAL ASN LYS ASP PRO LYS GLU LEU GLU 

SEQRES 13 A 713 GLU PHE LEU MSE THR HIS LYS GLU ASN LEU MSE LEU LYS 

SEQRES 14 A 713 ASP ILE ALA SER PRO LEU LYS THR VAL TYR LYS PRO CYS 
SEQRES 15 A

SEQRES 16 A 

SEQRES 17 A 

SEQRES 18 A 

SEQRES 19 A 

SEQRES 20 A 

SEQRES 21 A 

SEQRES 22 A 



SEQRES 
SEQRES 



27 A 

28 A 

29 A 

30 A 

31 A 

32 A 

33 A 

34 A 

35 A 

36 A 

37 A 

38 A 

39 A 

40 A 

41 A 

42 A 

43 A 

44 A 

45 A 

46 A 

47 A 

48 A 

49 A 

50 A 

51 A 

52 A 

53 A 



SEQRES 

SEQRES 

CRYST1 

0RIGX1 

ORIGX2 

ORIGX3 

SCALE 1 

SCALE2 

SCALE 3 

ATOM 

ATOM 

ATOM 

ATOM 

ATOM 

ATOM 

ATOM 

ATOM 



713 
713 
713 
713 
713 
713 
713 
713 
713 
713 
713 
713 

713 
713 
713 
713 
713 
713 
713 
713 
713 
713 
713 
713 
713 
713 
713 
713 
713 
713 
713 
713 
713 
713 
713 
713 
713 
713 
713 
713 
713 



PHE GLU 
GLN GLU 
ARG TYR 
ARG LYS 
LEU ALA 
LEU PRO 
LEU ALA 
GLU GLU 
VAL ASP 
GLU VAL 
ARG VAL 
GLN LEU 



GLU TYR 
ILE VAL 
TRP ASN 
PHE GLY 
LYS PHE 
GLN LEU 
LYS GLU 
ARG LYS 
SER ASP 
GLU LYS 
ARG ASP 
LEU TRP 



THR LYS 
LYS TYR 
THR PRO 
GLN VAL 
ALA SER 
VAL VAL 
ILE LEU 
GLU LEU 
ILE ILE 
ILE ALA 
ASP LYS 
THR ASN 



LYS PRO 
TRP TYR 
GLU ALA 
ASP LEU 
LYS ASN 
PRO THR 
GLU TYR 
LEU GLU 
ASP LYS 
GLN GLU 
GLY ASN 
TYR SER 



LEU ASP 
TYR HIS 
LEU GLU 
GLN PRO 
LYS ILE 
ASN ALA 
LYS LEU 
ILE LEU 
LEU SER 
GLU ASN 
VAL PRO 
LYS TYR 



HIS ASN 
ILE GLU 
PHE TYR 
ALA ILE 
TYR LEU 
GLU GLN 
MSE PRO 
ALA GLU 
GLU ILE 
LYS ILE 
ILE SER 
PRO VAL 



ILE LEU PRO TYR GLU VAL PRO GLU LYS PHE ARG LYS ILE 
ARG GLU ILE PRO MSE PHE ILE ILE LEU ASP SER GLY LEU 
LEU ALA ASP ILE GLN ASN PHE ALA THR ASN GLU PHE ARG 
GLU LEU VAL LYS SER MSE TYR TYR GLU LYS VAL ILE THR 
GLU ASP LEU ASN SER ASP LYS GLY ILE ILE GLU VAL VAL 
GLU GLN VAL SER SER PHE MSE LYS GLY LYS GLU LEU GLY 
LEU ALA PHE ILE ALA ALA ARG ASN LYS LEU SER SER GLU 
LYS PHE GLU GLU ILE LYS ARG ARG LEU PHE ASN LEU ASN 
VAL ILE SER GLN VAL VAL ASN GLU ASP THR LEU LYS ASN 
LYS ARG ASP LYS TYR ASP ARG ASN ARG LEU ASP LEU PHE 
VAL ARG HIS ASN LEU LEU PHE GLN VAL LEU SER LYS LEU 
GLY VAL LYS TYR TYR VAL LEU ASP TYR ARG PHE ASN TYR 
ASP TYR ILE ILE GLY ILE ASP VAL ALA PRO MSE LYS ARG 
SER GLU GLY TYR ILE GLY GLY SER ALA VAL MSE PHE ASP 
SER GLN GLY TYR ILE ARG LYS ILE VAL PRO ILE LYS ILE 
GLY GLU GLN ARG GLY GLU SER VAL ASP MSE ASN GLU PHE 
PHE LYS GLU MSE VAL ASP LYS PHE LYS GLU PHE ASN ILE 
LYS LEU ASP ASN LYS LYS ILE LEU LEU LEU ARG ASP GLY 
ARG ILE THR ASN ASN GLU GLU GLU GLY LEU LYS TYR ILE 
SER GLU MSE PHE ASP ILE GLU VAL VAL THR MSE ASP VAL 
ILE LYS ASN HIS PRO VAL ARG ALA PHE ALA ASN MSE LYS 
MSE TYR PHE ASN LEU GLY GLY ALA ILE TYR LEU ILE PRO 
HIS LYS LEU LYS GLN ALA LYS GLY THR PRO ILE PRO ILE 
LYS LEU ALA LYS LYS ARG ILE ILE LYS ASN GLY LYS VAL 
GLU LYS GLN SER ILE THR ARG GLN ASP VAL LEU ASP ILE 
PHE ILE LEU THR ARG LEU ASN TYR GLY SER ILE SER ALA 
ASP MSE ARG LEU PRO ALA PRO VAL HIS TYR ALA HIS LYS 
PHE ALA ASN ALA ILE ARG ASN GLU TRP LYS ILE LYS GLU 
GLU PHE LEU ALA GLU GLY PHE LEU TYR PHE VAL 



69.726 104.188 74.015 90.00 102.: 



1. 000000 
0 . 000000 
0 . 000000 
0 . 014342 
0 .000000 
0 .000000 
CB SER A 



0.000000 
1.000000 
0.000000 
0.000000 
0.009598 
0.000000 



0 .000000 
0.000000 
1.000000 
0.003265 
0.000000 
0.013857 
-9.237 31.412 



OG 



SER A 
C SER A 
O SER A 
N SER A 
CA SER A 



-9.737 
-10 .483 
-10.625 
-8 .485 
-9.110 
-11 .494 
-12 . 873 



30.155 
32.911 
34.075 
31.638 
32.356 
32.061 
32.436 



90.00 P 21 
0 . 00000 
0.00000 
0.00000 
0 . 00000 
0.00000 
0.00000 
15.695 
16.110 
17.228 
17.612 
18.031 
16. 883 
17.086 
17.365 



1.00 52.89 
1.00 52.89 
1.00 79.00 
1.00 79.00 
1.00 79.00 
1.00 79.00 
1.00 80.16 
1.00 80.16 
ATOM
ATOM
ATOM
ATOM
ATOM
ATOM 
ATOM 
ATOM 
ATOM 
ATOM
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 



3 3 CGI 
34 CD1 



CG2 
CGI 
CD1 



MSE A 
MSE A 
MSE A 
MSE A 
MSE A 
MSE A 
LYS A 
LYS A 
LYS A 
LYS A 
LYS A 
LYS A 
LYS A 
LYS A 
LYS A 
ALA A 
ALA A 
ALA A 
ALA A 

ALA A 
ILE A 
ILE A 
ILE A 
2 ILE A 
ILE A 
ILE A 
ILE A 
ILE A 
VAL A 
VAL A 
VAL A 
VAL A 
VAL A 
VAL A 
VAL A 
VAL A 
VAL A 
VAL A 
VAL A 
VAL A 
VAL A 
VAL A 
ILE A 
ILE A 
ILE A 
ILE A 
ILE A 
ILE A 
ILE A 
ILE A 
ASN A 
ASN A 
ASN A 
ASN A 
ASN A 
ASN A 



-13.792 
-15.277 
-15.681 
-16.904 
-13.266 
-12.455 
-14.514 
-15.048 
-15.945 
-15.274 
-16.202 
-15.521 
-16.416 
-15.860 
-16.065 
-16.313 
-17.103 
-16 .203 
-17.859 

-17 .505 
-18.921 
-19.728 
-21.239 
-22.021 
-21.545 
-21.149 
-19.433 
-19.513 
-19.075 
-18 .775 
-17.244 
-16.454 
-16.802 
-19.536 
-19.790 
-19.922 
-20.622 
-21.432 
-22.045 
-22 .532 
-19.549 
-18 .499 
-19. 802 
-18 . 856 
-18.278 
-17.298 
-19.418 
-19.007 
-19.595 
-20.830 
-18 .847 
-19.443 
-18.456 
-17 .200 
-16.866 
-16.487 



31.773 
32.075 
33.960 
33 .999 
32.013 
31.466 
32.287 
31.929 
33.044 
34.401 
35.452 
36.806 
37. 829 
30. 656 
30.196 
30.076 
28.864 
27.673 
28.681 

29.269 
27 . 885 
27.632 
27.735 
27.720 
29 . 034 
30.289 
26.223 
25.259 
26.103 
24.794 
24.619 
25.009 
25.455 
24.607 
25.565 
23 .369 
23.041 
21.730 
21.327 
21.925 
22.824 
22 .236 
23.308 
23.138 
24.496 
25.030 
25.475 
26.725 
22.476 
22 .460 
21.949 
21.253 
20 .201 



16.330 
16.475 
16.497 
17 . 989 
18.780 
19.529 
19.141 
20.442 
20 . 970 
21.005 
21.576 
21.662 
22.282 
20.264 
19.139 
21.370 
21.295 
21.016 
22.597 

23 .618 
22.552 
23.736 
23 .437 
24 . 743 
22.677 
23.415 
24.218 
23.454 
25.489 
26.033 
26.297 
25.056 
27.483 
27.328 
28.051 
27.597 
28.824 
28.673 
30.019 
27.614 
29.882 
29.591 
31. 095 
32.197 
32 . 676 
31.642 
32.948 
33.697 
33 .346 
33.353 
34.318 
35.460 
36.002 
36.637 
36.429 
37.404 



1.00148.49 
1.00148.49 
1.00148.49 
1.00148.49 
1.00 80.16 
1.00 80.16 
1.00 61.75 
1.00 61.75 
1.00 71.24 
1.00 71.24 
1.00 71.24 
1.00 71.24 
1.00 71.24 
1.00 61.75 
1.00 61.75 
1.00 49.57 
1.00 49.57 
1.00 38.38 
1 . 00 49 . 57 

1.00 49.57 
1.00 48.74 
1.00 48 . 74 
1.00 52.34 
1.00 52.34 
1.00 52.34 
1.00 52.34 
1.00 48.74 
1.00 48.74 
1.00 41.68 
1.00 41 . 68 
1.00 47.32 
1.00 47.32 
1.00 47.32 
1.00 41.68 
1.00 41.68 
1.00 42 . 89 
1.00 42 . 89 
1.00 36.57 
1.00 36.57 
1.00 36.57 
1.00 42.89 
1.00 42.89 
1.00 41.08 
1.00 41.08 
1.00 37.86 
1.00 37.86 
1.00 37.86 
1.00 37.86 
1.00 41.08 
1.00 41.08 
1.00 40.94 
1.00 40.94 
1.00 34.10 
1.00 34.10 
1.00 34.10 
1.00 34.10 
ATOM
ATOM
ATOM
ATOM
100
101 
102 
103 
104 
10B 
106 
107 
108 
109 
110 
111 
112 
113 
114 
115 
116 
117 
118 
119 
120 
121 



C ASN A 
O ASN A 
N LEU A 
CA LEU A 
CB LEU A 
CG LEU A 
CD1 LEU A 
CD 2 LEU A 
C LEU A 
O LEU A 
N VAL A 
CA VAL A 
CB VAL A 
CGI VAL A 
CG2 VAL A 
C VAL A 
O VAL A 
N LYS A 
CA LYS A 
CB LYS A 
CG LYS A 
CD LYS A 
CE LYS A 
NZ LYS A 
C LYS A 
O LYS A 
N ILE A 
CA ILE A 
CB ILE A 
CG2 ILE A 
CGI ILE A 
CD1 ILE A 
C ILE A 
O ILE A 
N ASN A 
CA ASN A 
CB ASN A 
CG ASN A 
OD1 ASN A 
ND2 ASN A 
C ASN A 
O ASN A 
N LYS A 
CA LYS A 
CB LYS A 
CG LYS A 
CD LYS A 
CE LYS A 
NZ LYS A 
C LYS A 
O LYS A 
N LYS A 
CA LYS A 
CB LYS A 
CG LYS A 
CD LYS A 
CE LYS A 



-19.989 
-19.796 
-20.689 
-21.307 
-20.774 
-19.433 
-18 .292 
-19.275 
-22 . 816 
-23 . 280 
-23.570 
-25.011 
-25.728 
-25.372 
-27.219 
-25.409 
-25 . 005 
-26.191 
-26 .634 
-27.457 
-27.732 
-28 .623 
-29.030 
-30.076 
-27.444 
-28.290 
-27 .181 
-27.895 
-26.921 
-27 . 682 
-25.846 
-24.656 
-28.986 
-28 .744 
-30.177 
-31.263 
-32.602 
-33 .781 
-34 .493 
-33 .981 
-30 .993 
-30.650 
-31.177 
-30.930 
-31.128 
-32 .582 
-32 .700 
-34.152 
-34.268 
-31.802 
-31.497 
-32.883 
-33.751 
-35.048 
-34.907 
-36.283 
-36.210 



22.120 
21.815 
23 .190 
24.102 
25.537 
25.989 
25.228 
27 .487 
24.136 
23.929 
24.400 
24.503 
23.223 
22.949 
23.389 
25.702 
25 . 821 
26.595 
27.794 
28.667 
30.067 
30.840 
32 . 183 
32.817 
27.494 
26.604 
28.252 
28.081 
28 .253 
28 .230 
27 . 160 
27.420 
29.157 
30.311 
28 .793 
29.771 
29.078 
30.010 
29.907 
30.938 
30.751 
30.336 
32.043 
33.087 
34.473 
34 . 855 
36.281 
36.663 
38.033 
32.972 
33 .563 
32.213 
32.069 
31.347 
29.861 



36.602 
37 . 781 
36.239 
37.198 
37.038 
37.614 
36.968 
37.387 
36.948 
35.821 
38.009 
37.921 
38 .435 
39.880 
38 .298 
38 . 770 
39.935 
38.175 
38.862 
37.913 
38.450 
37.503 
38.076 
37.209 
40.119 
40.140 
41.171 
42.433 
43.618 
44.927 
43 .563 
44.491 
42 . 509 
42.152 
42.968 
43 . 061 
43.307 

43 . 089 
42.090 
44.014 

44 .201 
45.305 
43 .932 
44.921 
44.300 
44.063 
43 .541 
43.289 
42.713 
46.167 
47.201 
46.084 
47.240 
46.856 
46.563 
46.347 
46.172 



1.00 40.94 
1.00 40.94 
1.00 39.65 
1.00 39.65 
1.00 33.69 
1.00 33.69 
1.00 33.69 
1.00 33.69 
1.00 39.65 
1.00 39.65 
1.00 43.35 
1.00 43.35 
1.00 52.91 
1.00 52.91 
1.00 52.91 
1.00 43.35 
1.00 43.35 
1.00 45.06 
1.00 45.06 
1.00 61.94 
1.00 61.94 
1.00 61.94 
1.00 61.94 
1.00 61.94 
1 . 00 45 . 06 
1.00 45.06 
1.00 38.81 
1.00 38.81 
1.00 40 . 09 
1 . 00 40 . 09 
1.00 40 . 09 
1.00 40.09 
1.00 38.81 
1.00 38.81 
1.00 47.45 
1.00 47.45 
1.00 60 . 49 
1.00 60.49 
1.00 60.49 
1.00 60.49 
1.00 47.45 
1.00 47.45 
1.00 50.64 
1.00 50.64 
1.00 99.82 
1.00 99.82 
1.00 99.82 
1.00 99.82 
1.00 99.82 
1.00 50.64 
1.00 50.64 
1.00 59.16 
1.00 59.16 
1.00 77.44 
1.00 77.44 
1.00 77.44 
1.00 77.44 



FIGURE 25 CON'T 
Page 5 of 111 



WO 2006/015258 



PCT/US2005/027084 



ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 

ATOM 
ATOM 
ATOM 
ATOM 
ATOM . 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 



125 
126 
130
133
135
136 
137 
138 
139 
14 0 
141 
142 
143 
144 
14 5 
147
148
149
151
152
158
159 
160 
161 
162 
163 
164 
165 
166 
167 
168 
169 
170 
171 
172 
173 
174 
175 
176 
177 



LYS A 
LYS A 
LYS A 
ILE A 
ILE A 
ILE A 
ILE A 
ILE A 
ILE A 
ILE A 
ILE A 
ILE A 
ILE A 
ILE A 
ILE A 
ILE A 
ILE A 
ILE A 
ILE A 
PRO A 
PRO A 
PRO A 
PRO A 
PRO A 
PRO A 
PRO A 
ASP A 
ASP A 
ASP A 
ASP A 
ASP A 
ASP A 
ASP A 
ASP A 



N LYS i 

CA LYS ; 

CB LYS i 

CG LYS j 



NZ LYS P. 
C LYS P. 
O LYS P. 
N ILE P 
CA ILE P 
CB ILE P 
CG2 ILE P 
CGI ILE P 
CD1 ILE P 
C ILE I 
O ILE I 
N TYR I 
CA TYR I 
CB TYR I 
CG TYR ; 
CD1 TYR 7 



-37.577 
-33 . 042 
-33.580 
-31.834 
-31.089 
-30.008 
-28.900 
-29.429 
-28 .387 
-30.413 
-30.199 
-30.084 
-29.440 
-29.332 
-28.759 
-28 .445 
-28.312 
-30 .251 
-31.474 
-29.573 
-28.105 
-30 .211 
-29.136 
-27.889 
-30.654 
-29.958 
-31.812 
-32.359 
-33 .726 
-34 . 729 
-35.052 
-35.191 
-31.410 
-31.004 

-31.053 
-30.150 
-30.921 
-31.956 
-33 .246 
-34.345 
-34 .762 
-28 . 948 
-29.064 
-27 . 793 
-26.539 
-25 .423 
-24 .108 
-25.824 
-24.886 
-26.128 
-26.275 
-25.614 
-25.198 
-25.946 
-27.447 
-28.228 



27.136 
31.316 
31.178 
30.825 
30.105 
29.171 
29.990 
28 .281 
27.287 
31.101 
30.817 
32.274 
33.314 
34.611 
35 . 748 
34.335 



35 .487 
33 .522 
33 .669 
33.517 
33 .488 
33.688 
33.201 
33.752 
35.112 
36.074 
35.229 
36.525 
36 .344 
35.656 
36.221 
34.549 
37 .244 
38 .379 

36.580 
37 . 176 
37.581 
36.578 
36.672 
35.798 
36.244 
36.315 
35.109 
36.966 
36.323 
36.637 
36.017 
36.121 
36.563 
36.891 
38.087 
36.037 
36.469 
35. 663 
35.661 
36.741 



46.125 
48.365 
49.457 
48 . 095 
49.123 
48.513 
47.868 
49.614 
49.147 
50.062 
51.238 
49.531 
50.317 
49.481 
50.319 
48.256 
47.310 
51. 600 
51.562 
52 . 759 
52 . 895 
54. 067 
55.030 
54 .392 
54.376 
54.055 
55.016 
55.389 
56.053 
55.146 
54.080 
55.498 
56.345 
56.088 

57.442 
58 .424 
59.690 
50.185 
59.382 
59.972 
61.333 
58.814 
59.018 
58.907 
59.282 
58.251 
58.699 
56.866 
55. 756 
60.644 
60.890 
61.522 
62.854 
63 . 916 
63 .739 
64.161 
19B
208
214
215 
216 
217 
218 
219 
220 
221 
222 
223 
224 
225 
226 
227 
228 
229 
230 
231 
232 
233 
234 



CE1 TYR P 
CD 2 TYR P. 
CE2 TYR P 
CZ TYR P 
OH TYR P 
C TYR P 
O TYR P 
N VAL P 
CA VAL P 
CB VAL I 
CGI VAL I 
CG2 VAL I 
C VAL I 
O VAL ; 
N TYR 1 
CA TYR I 
CB TYR 2 
CG TYR ; 
CD1 TYR 1 
CE1 TYR 1 
CD2 TYR 2 
CE2 TYR 2 
CZ TYR 2 



OH 



N 



TYR 2 
TYR 2 
TYR 2 



CA ARG P 
CB ARG P 
CG ARG P 
CD ARG P 
NE ARG P 
CZ ARG P 
NH1 ARG I 
NH2 ARG P 
C ARG P 
O ARG I 
N LEU / 
CA LEU 2 
CB LEU 1 
CG LEU I 
CD1 LEU 2 
CD 2 LEU 2 
C LEU 2 
O LEU 1 
N TYR ; 
CA TYR 2 
CB TYR 2 
CG TYR i 
CD1 TYR 2 
CE1 TYR 2 
CD 2 TYR 2 
CE2 TYR 2 
CZ TYR 2 
OH TYR 2 
C TYR l 
O TYR J 



-29.616 
-28.089 
-29.469 
-30.227 
-31.590 
-23.694 
-23.164 
-23.010 
-21.569 
-20.882 
-19.374 
-21.217 
-21.297 
-21.990 
-20 .293 
-19.922 
-20.523 
-22.788
-18.171
34.585
35.631
37.095
35.069
40.895
35.380 
N ILE 1
CA ILE ; 
CB ILE 1 
CG2 ILE J 
CGI ILE 1 
CD1 ILE 2 
C ILE 2 
O ILE 1 
N TYR 2 
CA TYR 2 



CD1 TYR I 
CE1 TYR 2 
CD2 TYR 2 
CE2 TYR 1 
CZ TYR 1 
OH TYR 2 
C TYR 1 
O TYR 2 
N ARG 2 
CA ARG 2 
CB ARG 2 



NE ARG I 
CZ ARG 1 
NH1 ARG 1 
NH2 ARG I 
C ARG I 
O ARG 1 
N LEU 2 
CA LEU ; 
CB LEU 1 
CG LEU 1 
CD1 LEU 1 
CD 2 LEU 2 
C LEU 2 
O LEU 2 
N ALA i 
CA ALA 2 



O ALA A 
N TYR A 
CA TYR A 
CB TYR A 
CG TYR A 
CD1 TYR A 
CE1 TYR A 



-14.594 
-13 .726 
-12.844 
-13.623 
-14.564 
-15.795 
-13 .885 
-14.556 
-13 .568 
-12 .487 
-14.319 
-13 .425 
-15.194 
-14.621
-15.133
-13.024
21.772
22 . 883 
22.493 
22.334 
24.094 
24.045 
25.176 
26.407 
27 . 590 
27 . 320 
28 .877 
30.082 
26.236 
26.903 
25.342 
25.058 
24.330 
25.237 
26.036 
26.862 
25.286 
26.107 
26.892 
27.703 
24.196 
24.360 
23 .278 
22 .393 
21.203 
20.305 
19.134 
18.315 
17.245 
16.857 
16.561 
23.165 
22.869 
24.150 
24.967 
25.953 
26.990 
26.291 
27.988 
25.729 
25.618 
26.493 
27.276 
27.876 
26.398 

26.812 
25.177 
24.219 
22.985 
21.954 
21.998 
21.090 
66.910
65.722
64.548
65.747
64.657
65.353 
65.224 
64.372 
64. 047 
63.570 
62.235 
61.408 
60.793 
59.692 
59.107 
61 . 303 
60.727 
64.650
65.825
330
331
335
344
345 
346 
347 



CD2 TYR A < 
CE2 TYR A < 
CZ TYR A i. 
OH TYR A < 
C TYR A 
O TYR A '. 
N GLU A ' 
CA GLU A ' 
CB GLU A < 
CG GLU A ' 
CD GLU A < 
OE1 GLU A - 
OE2 GLU A ■ 
C GLU A • 
O GLU A • 
N ASN A • 
CA ASN A 
CB ASN A 
CG ASN A 
OD1 ASN A 
ND2 ASN A 
C ASN A 
O ASN A 
N VAL A 
CA VAL A 
CB VAL A 
CGI VAL A 
CG2 VAL A 
C VAL A 
O VAL A 
N GLY A 
CA GLY A 
C GLY A 
O GLY A 
N ILE A 
CA ILE A 
CB ILE A 
CG2 ILE A 
CGI ILE A 
CD1 ILE A 
C ILE A 
O ILE A 
N VAL A 
CA VAL A 
CB VAL A 
CGI VAL A 
CG2 VAL A 
C VAL A 
O VAL A 
N ILE A 
CA ILE A 
CB ILE A 
CG2 ILE A 
CGI ILE A 
CD1 ILE A 
C ILE A 
O ILE A 



-21.675 

-22.370 

-22.250 

-22.933 

-22 .341 

-23 .380 

-22.283 

-23 .469 

-23 . 085 

-22.106 

-21.873 

-21.515 

-22 . 043 

-24.540 

-25.725 

-24 . 120 

-25 . 059 

-24 .462 

-24.410 

-25 .445 

-23 .203 

-25.415 

-25.897 

-25.188 

-25.479 

-26.996 

-27.275 

-27.754 

-24.791 

-25.437 

-23 .471 

-22.699 

-21.288 

-20.981 

-20.435 

-19.049 

-18.829 
-15.767
22.884
19.324
23.661
25.222 
26.333 
27.458 
27.102 
26.991 
26.917 
26.854 
27 . 975 
26.026 
26.409 
26.575 
26.655 
25.410 

27 . 734 

28 .729 
27 .734 
28 . 932 
28.562 
27.379 
29.570 
29.365 
29.759 
28.727 
31.134 
32.291 
30.157 
30.830 
30.069 
30.772 
29.844 
30.618 
28.659 
31.967 
31.810 
33 .157 
34.381 
35.608 
36.879 
35.453 
36.623 
34.668 
34.796 
58.092
63.575
63.524
65.950
62.058
58.075
59.187
57.890

59.622 

58.956 

59.445 

58.828 

59.077 

59.513 

59.177 

58.221 



1.00 63.27 
1.00 63.27 
1.00 63.27 
1.00 63.27 
1.00 63.28 
1.00 63.28 
1.00 72.26 
1.00 72.26 
1.00120.38 
1.00120.38 
1.00120.38 
1.00120.38 
1.00120.38 
1.00 72.26 
1.00 72.26 
1.00 83.72 
1.00 83.72 
1.00 71.59 
1.00 71.59 
1.00 71.59 
1.00 71.59 
1.00 83.72 
1.00 83.72 
1.00 69.39 
1.00 69.39 
1.00 62.36 
1.00 62.36 
1.00 62.36 
1.00 69.39 
1.00 69.39 
1.00 52 . 70 
1.00 52.70 
1.00 52.70 
1.00 52 .70 
1.00 58.71 
1.00 58.71 
1.00 69.06 
1.00 69.06 
1.00 69.06 
1.00 69.06 
1.00 58.71 
1.00 58.71 
1.00 74.42 
1.00 74.42 
1.00 52.30 
1.00 62.30 
1.00 62.30 
1.00 74.42 
1.00 74.42 
1.00 97.43 
1.00 97.43 
1 . 00108 .88 
1 . 00108 .88 
1 . 00108 . 88 
1.00108.88 
1.00 97.43 
1.00 97.43 






ATOM    348  N   ASP A  54     -13.024  34.778  60.441  1.00 84.01
ATOM    349  CA  ASP A  54     -11.635  35.058  60.785  1.00 84.01
ATOM    350  CB  ASP A  54     -11.544  36.381  61.555  1.00 79.35
ATOM    351  CG  ASP A  54     -10.156  36.999  61.499  1.00 79.35
ATOM    352  OD1 ASP A  54      -9.165  36.277  61.745  1.00 79.35
ATOM    353  OD2 ASP A  54     -10.058  38.212  61.214  1.00 79.35
ATOM    354  C   ASP A  54     -11.079  33.917  51.641  1.00 84.01
ATOM    355  O   ASP A  54     -11.320  33.861  62.848  1.00 84.01
ATOM    356  N   PRO A  55     -10.332  32.988  61.020  1.00 99.05
ATOM    357  CD  PRO A  55     -10.006  32.935  59.584  1.00107.30
ATOM    358  CA  PRO A  55      -9.742  31.845  61.725  1.00 99.05
ATOM    359  CB  PRO A  55      -8.938  31.145  60.632  1.00107.30
ATOM    360  CG  PRO A  55      -9.698  31.474  59.387  1.00107.30
ATOM    361  C   PRO A  55      -8.856  32.298  62.886  1.00 99.05
ATOM    362  O   PRO A  55      -8.745  31.615  63.906  1.00 99.05
ATOM    363  N   GLU A  56      -8.225  33.455  62.715  1.00113.58
ATOM    364  CA  GLU A  56      -7.348  34.009  63.738  1.00113.58
ATOM    365  CB  GLU A  56      -6.589  35.221  63.187  1.00129.86
ATOM    366  CG  GLU A  56      -6.240  35.137  61.708  1.00129.86
ATOM    367  CD  GLU A  56      -5.482  33.875  61.350  1.00129.86
ATOM    368  OE1 GLU A  56      -4.412  33.630  61.946  1.00129.86
ATOM    369  OE2 GLU A  56      -5.958  33.128  60.469  1.00129.86
ATOM    370  C   GLU A  56      -8.185  34.449  64.934  1.00113.58
ATOM    371  O   GLU A  56      -8.154  33.829  65.997  1.00113.58
ATOM    372  N   ASN A  57      -8.941  35.524  64.734  1.00 99.99
ATOM    373  CA  ASN A  57      -9.791  36.098  65.768  1.00 99.99
ATOM    374  CB  ASN A  57     -10.204  37.512  65.353  1.00110.57
ATOM    375  CG  ASN A  57      -9.010  38.419  65.105  1.00110.57
ATOM    376  OD1 ASN A  57      -8.276  38.767  66.032  1.00110.57
ATOM    377  ND2 ASN A  57      -8.806  38.800  63.848  1.00110.57
ATOM    378  C   ASN A  57     -11.037  35.270  66.085  1.00 99.99
ATOM    379  O   ASN A  57     -11.990  35.782  66.673  1.00 99.99
ATOM    380  N   LEU A  58     -11.024  33.998  65.698  1.00 78.90
ATOM    381  CA  LEU A  58     -12.148  33.096  65.946  1.00 78.90
ATOM    382  CB  LEU A  58     -12.103  32.589  67.392  1.00 92.24
ATOM    383  CG  LEU A  58     -10.853  31.837  67.851  1.00 92.24
ATOM    384  CD1 LEU A  58     -10.982  31.496  69.327  1.00 92.24
ATOM    385  CD2 LEU A  58     -10.675  30.571  67.027  1.00 92.24
ATOM    386  C   LEU A  58     -13.510  33.748  65.683  1.00 78.90
ATOM    387  O   LEU A  58     -14.413  33.681  66.520  1.00 78.90
ATOM    388  N   ILE A  59     -13.654  34.375  64.520  1.00 74.85
ATOM    389  CA  ILE A  59     -14.907  35.034  64.159  1.00 74.85
ATOM    390  CB  ILE A  59     -14.642  36.384  63.466  1.00 89.43
ATOM    391  CG2 ILE A  59     -15.963  37.062  63.121  1.00 89.43
ATOM    392  CG1 ILE A  59     -13.800  37.275  64.384  1.00 89.43
ATOM    393  CD1 ILE A  59     -13.414  38.605  63.770  1.00 89.43
ATOM    394  C   ILE A  59     -15.741  34.160  63.230  1.00 74.85
ATOM    395  O   ILE A  59     -15.301  33.806  62.141  1.00 74.85
ATOM    396  N   ILE A  60     -16.945  33.812  63.671  1.00 64.45
ATOM    397  CA  ILE A  60     -17.840  32.978  62.879  1.00 64.45
ATOM    398  CB  ILE A  60     -18.218  31.689  63.639  1.00 62.85
ATOM    399  CG2 ILE A  60     -16.955  30.962  64.093  1.00 62.85
ATOM    400  CG1 ILE A  60     -19.089  32.031  64.845  1.00 62.85
ATOM    401  CD1 ILE A  60     -19.587  30.816  65.591  1.00 62.85
ATOM    402  C   ILE A  60     -19.131  33.706  62.508  1.00 64.45
ATOM    403  O   ILE A  60     -19.567  34.619  63.211  1.00 64.45
ATOM    404  N   ALA A  61     -19.732  33.289  61.397  1.00 67.87






ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 

ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 



405 
406 
407 
408 
409 
410 
411 

412 
413 
414 
415 
416 
417 
418 
419 
420 
421 
422 
423 
424 
425 



CA ALA A 61 

CB ALA A 61 

C ALA A 61 

O ALA A 61 

N THR A 62 

CA THR A 62 

CB THR A 62 

OG1 THR A 62 

CG2 THR A 62 

C THR A 62 

O THR A 62 

N THR A 63 

CA THR A 63 

CB THR A 63 

OG1 THR A 63 

CG2 THR A 63 

C THR A 63 

O THR A 63 

N LYS A 64 

CA LYS A 64 

CB LYS A 64 

CG . LYS A 64 

CD LYS A 64 

CE LYS A 64 



429 
430 
431 
432 
433 
434 
435 
436 
437 
438 
439 
440 
441 
442 
443 
444 
445 
446 
447 
448 
449 
450 
451 
452 
453 
454 
455 
456 
457 
458 



O LYS i 
N GLU i 
CA GLU 1 



CD GLU ? 
OE1 GLU 2 
OE2 GLU P 
C GLU ? 
O GLU ? 
N LEU ? 
CA LEU I 
CB LEU f 
CG LEU I 
CD1 LEU 7 
CD 2 LEU I 
C LEU I 
O LEU I 
N GLU I 
CA GLU 1 
CB GLU 1 
CG GLU 1 
CD GLU 1 
OE1 GLU 1 
OE2 GLU 1 
C GLU 1 
O GLU I 
N TYR 2 
CA TYR 1 
CB TYR 1 



-20 . 981 
-20.760 
-21.990 
-21.765 
-23 .104 
-24.134 
-24.341 

-23 .140 
-25.483 
-25.467 
-25.749 
-26.282 
-27.588 
-27.928 
-28.085 
-26.835 
-28.669 
-29.852 
-28.253 
-29.181 
-29.253 
-29.890 
-31.329 
-32 .007 
-32.095 
-28.763 
-27.661 
-29.655 
-29.379 
-30.639 
-30.475 
-31.782 
-32.708 
-31.884 
-28.250 
-28.429 
-27.088 
-25.912 
-24.745 
-23.378 
-23.413 
-22 . 991 
-26.114 
-26.455 
-25.902 
-26.008 
-26.849 
-28.325 
-29. 129 
-28 . 814 
-30.073 
-24.587 
-23.938 
-24.103 
-22.752 
-21.766 



33.863 
34.517 
32.723 
31.747 
32.855 
31.824 
31.368 

30.755 
30.396 
32.315 
33.509 
31.401 
31. 788 
31.099 
29.691 
31.347 
31.410 
31.642 
30.826 
30 .414 
28 .885 
28 .206 
28.658 
27 . 881 
26 .429 
30.970 
31.492 
30.855 
31.333 
31.217 
31.723 
31.725 
32.463 
30.990 
30.493 
29.310 
31.115 
30.432 
31.421 
30.947 
30.787 
29.639 
29.724 
30.351 
28 .411 
27.608 
26.351 
26.602 
25.316 
24.407 
25.212 
27.201 
26.464 
27.693 
27.388 
28.291 



60.909 
59.552 
60.799 
60 . 085 
61.505 
61.536 
62.989 

63 .470 
63.095 
60.976 
61.013 
60.450 
59.922 
58.568 
58 . 763 
57.551 
60 . 928 
60.693 
62 . 048 
63 . 100 
63 . 188 
61. 987 
61.789 
60.670 
60. 987 
64.455 
64.608 
65.434 
66.785 
67.649 
69.076 
69.849 
69.450 
70.853 
67.378 
67 . 675 
67 .548 
68.077 
68.202 
68.716 
70.217 
68.046 
69.409 
70.412 
69.406 
70.618 
70.389 
70.159 
70.173 
69.374 
70.984 
70.977 
70.236 
72.109 
72.549 
71.809 



1.00 67.87 
1.00 50.62 
1.00 67.87 
1.00 67.87 
1.00 76.91 
1. 00 76.91 
1.00 83.13 

1.00 83.13 
1.00 83.13 
1 . 00 76 . 91 
1.00 76.91 
1.00 74.40 
1.00 74.40 
1.00 65.90 
1.00 65 . 90 
1.00 65 . 90 
1.00 74.40 
1.00 74.40 
1.00 78 . 25 
1.00 78.25 
1 . 00112 . 19 
1.00112.19 
1 . 00112 . 19 
1 . 00112 . 19 
1.00112.19 
1.00 78.25 
1.00 78.25 
1.00 99.21 
1.00 99.21 
1.00164.67 
1 . 00164 . 67 
1.00164 .67 
1.00164.67 
1.00164.67 
1.00 99.21 
1.00 99.21 
1.00 89.99 
1.00 89.99 
1.00 81.73 
1.00 81.73 
1.00 81.73 
1.00 81.73 
1.00 89.99 
1.00 89.99 
1 .00111 . 99 
1.00111.99 
1.00122.10 
1.00122.10 
1.00122 .10 
1.00122.10 
1.00122.10 
1 . 00111 . 99 
1.00111.99 
1.00 96.00 
1.00 96.00 
1.00 74.64 






ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 



463 
464 
465 
466 
467 
468 
469 
470 
471 
472 
473 
474 
475 
476 
477 
478 
479 
480 
481 
482 
483 
484 
485 
486 
487 
488 
489 
490 
491 
492 
493 
494 
495 
496 
497 
498 
499 
500 
501 
502 
503 
504 
505 
506 
507 
508 
509 
510 
511 
512 
513 
514 
515 
516 
517 



CG TYR 2 
CD1 TYR 2 
CE1 TYR 2 
CD 2 TYR 2 
CE2 TYR 1 
CZ TYR 2 
OH TYR 2 
C TYR 1 
O TYR 2 
N GLU 2 
CA GLU 1 



CD GLU 2 

OE1 GLU I 

0E2 GLU I 

C GLU ; 

O GLU I 

N GLY ; 

CA GLY 2 

C GLY 2 

O GLY 2 

N GLU ; 

CA GLU 2 

CB GLU 2 

CG GLU 2 

CD GLU 2 

OE1 GLU 2 

OE2 GLU 2 

C GLU 2 

O GLU i 

N PHE 1 

CA PHE 1 



CB PHE A 72 

CG PHE A 72 

CD1 PHE A 72 

CD2 PHE A 72 

CE1 PHE A 72 

CE2 PHE A 72 

CZ PHE A 72 



O PHE 2 
N ILE 2 
CA ILE 2 
CB ILE 2 
CG2 ILE 2 
CGI ILE 2 
CD1 ILE 2 
C ILE 2 
O ILE J 
N PRO 2 
CD PRO i 
CA PRO 1 
CB PRO 2 



CG 



PRO 2 
PRO 2 
PRO 2 



-20 .315 
-19.759 
-18 .412 
-19.488 
-18.145 
-17.612 
-16.278 
-22.659 
-22 . 832 
-22.390 
-22 .289 
-22.451 
-23 .712 
-23 .775 
-22 .836 
-24.766 
-20.970 
-20 .547 
-20.330 
-19.066 
-19.118 
-20 .160 
-17.989 
-17.910 
-16.650 
-15.407 
-15.054 
-15.702 
-14.158 
-17 . 920 
-17 .435 
-18 .477 
-18.537 
-19.192 
-20 .676 
-21.566 
-21.183 
-22.943 
-22.558 
-23.439 
-19.297 
-20.227 
-18 .895 
-19.556 
-18 .587 
-17 .337 
-19.303 
-18 .441 
-20.168 
-19.466 
-21.494 
-22.468 
-22.162 
-23 .640 
-23 .672 
-21.797 
-22.059 



26.800 
26.612 
29.168 
28.992 
27.716 
27.556 
27.595 
28 . 708 
26.507 
26.532 
25.111 
24.425 
22.957 
22.214 
22.546 
27.130 
26.902 
27 . 900 
28.527 
30.039 
30 . 612 
30.691 
32.144 
32.646 
31 . 759 
31 .412 
30 .496 
32.055 
32.650 
31.976 
33 . 842 
34.443 
33 .470 
33 .303 
34.245 
32.196 
34.084 
32.024 
32.969 
35.770 
35.973 
36.670 
37.958 
39.128 
38.967 
40 .444 
41.665 
38.133 
38.102 
38.312 
38.336 
38.492 
38.395 
38.979 
39.842 
40.884 



72.163 
72.202 
72.503 
72 .434 
72.736 
72.769 
73 .068 
74.054 
74 . 556 
74 . 767 
76.218 
76.756 
76.252 
76.623 
76.268 
77.263 
76.693 
77.825 
75.819 
76.163 
76. 041 
75.717 
76.301 
76.220 
76.933 
76. 786 
75.344 
74.782 
74.773 
74. 782 
73 . 873 
74.581 
73.250 
72.255 
72.442 
71.935 
73 . 115 
72 . 092 
73.280 
72 .767 
73 .246 
74 . 029 
72.357 
72 .248 
72 .494 
71.644 
72 .182 
72.295 
70.865 
69.854 
70.803 
71.909 
69.514 
69.879 
71.260 
68.912 
69.510 



1.00 74.64 
1.00 74.64 
1.00 74.64 
1.00 74.64 
1.00 74.64 
1.00 74.64 
1.00 74.64 
1.00 96.00 
1.00 96.00 
1.00113.97 
1 . 00113 . 97 
1.00132.01 
1 . 00132 . 01 
1.00132.01 
1 . 00132 . 01 
1.00132.01 
1 . 00113 . 97 
1 .00113 . 97 
1 .00103 . 38 
1.00103.38 
1 . 00103 . 38 
1 . 00103 . 38 
1.00 97.46 
1.00 97.46 
1 . 00122 . 18 
1.00122.18 
1.00122 . 18 
1.00122.18 
1.00122.18 
1.00 97 .46 
1.00 97.46 
1.00 79.72 
1.00 79.72 
1.00 65.28 
1.00 65.28 
1.00 65.28 
1.00 65.28 
1.00 65.28 
1.00 65.28 
1.00 65.28 
1.00 79.72 
1.00 79.72 
1.00131.12 
1.00131.12 
1.00107.34 
1.00107.34 
1.00107.34 
1.00107.34 
1.00131.12 
1.00131.12 
1.00110.61 
1.00106.24 
1.00110.61 
1.00106.24 
1.00106.24 
1.00110.61 
1.00110.61 






ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 

ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 



518 
519 
520 
521 
522 
523 
524 
525 
526 
527 
528 
529 
530 
531 
532 
533 
534 
535 
536 
537 



540 
541 
542 
543 
544 
545 
546 
547 
548 
549 
550 
551 
552 
553 
554 
555 
556 
557 
558 
559 
560 
561 
562 
563 
564 
565 
566 
567 
558 
569 
570 
571 
572 
573 



N GLU I 

CA GLU t 

CB GLU J 

CG GLU i 

CD GLU ; 

OE1 GLU ; 

OE2 GLU 1 

C GLU ; 

O GLU ; 

N GLY 1 

CA GLY 1 

C GLY 1 

O GLY 1 

N GLU 1 

CA GLU 1 



CD GLU I 
OE1 GLU ? 
OE2 GLU t 
C GLU 7 
O GLU t 



ILE A 
ILE A 
ILE A 
ILE A 
ILE A 
ILE A 
ILE A 
ILE A 
SER A 
SER A 
SER A 
SER A 



N PHE A 80 

CA PHE A 80 

CB PHE A 80 

CG PHE A 80 

CD1 PHE A 80 

CD 2 PHE A 8 0 

CE1 PHE A 8 0 

CE2 PHE A 80 

CZ PHE A 80 

C PHE A 80 

O PHE A 8 0 

N SER A 81 

CA SER A 81 

CB SER A 81 

OG SER A 81 

C SER A 81 

O SER A 81 

N GLU A 82 

CA GLU A 82 

CB GLU A 82 



-21.181 
-20.785 
-19.602 
-18.334 
-17.793 
-16.732 
-18.425 
-21.952 
-21.756 
-23 . 166 
-24.353 
-24.647 
-23 .866 
-25.783 
-26.191 
-27.705 
-28.511 
-30.007 
-30.777 
-30.410 
-25.492 
-24.963 

-25.497 
-24.861 
-23 .4C4 
-23.412 
-22.630 
-21.164 
-25.678 
-26.488 
-25.482 
-26.202 
-26.898 
-25.958 
-25.247 
-24.067 
-25.781 
-25.027 
-26.002 
-25.344 
-24.815 
-25.273 
-24.231 
-24.688 
-24.168 
-24 . 194 
-23 .107 
-24.705 
-24 . 022 
-24 . 933 
-25.289 
-22.696 
-21.824 
-22 .542 
-21 .321 
-21.650 



39.817 
41.044 
40.775 
40.329 
41.382 
41.139 
42.452 
41.616 
42 . 335 
41 .285 
41.777 
41.069 
40.236 
41 .405 
40.800 
40.910 
40.184 
40.235 
39.706 
40.803 
41.462 
42 .566 

40.777 
41.280 
40.750 
39.261 
41.506 
41 . 124 
40 . 788 
39. 872 
41.403 
40.985 
42.186 
43.180 
40.330 
40.686 
39.379 
38.632 
37.814 
36.777 
35.616 
36.946 
34.637 
35.970 
34.815 
39.534 
39.153 
40.733 
41.692 
42.895 
43.543 
42.197 
42.639 
42.139 
42.627 
43.141 



67.736 
67.054 
66.120 
66.830 
67.777 
68.391 
67.906 
66 . 257 
65.275 
66.687 
66.013 
64.707 
64.247 
64.106 
62 . 849 
62 . 671 
63.735 
63 .471 
64.303 
62.431 
61 . 679 
61.799 

60.542 
59.335 
59.232 
58.911 
58.153 
58.085 
58.142 
58 .274 
56.984 
55.794 
55.148 
54.782 
54.793 
54.708 
54.033 
53.038 
52.190 
51.327 
51.889 
49.948 
51.086 
49.137 
49.708 
52.134 
51.686 
51.870 
51.002 
50.733 
51.947 
51.553 
50 . 797 
52.869 
53 .497 
54.900 



1.00 87.29 
1.00 87.29 
1.00 79.91 
1 . 00 79. 91 
1.00 79.91 
1.00 79.91 
1.00 79.91 
1.00 87 . 29 
1.00 87.29 
1.00123.51 
1.00123 .51 
1 .00123 . 51 
1 . 00123 . 51 
1.00 80.94 
1.00 80 . 94 
1.00 91.89 
1.00 91 . 89 
1.00 91.89 
1.00 91.89 
1.00 91.89 
1.00 80.94 
1.00 80.94 

1.00 71.96 

1.00 71.96 

1.00 94.95 

1.00 94.95 

1.00 94.95 

1.00 94.95 

1.00 71.96 

1.00 71.96 

1.00 54.93 

1.00 54.93 

1.00 95.68 

1.00 95.68 

1.00 54.93 

1.00 54.93 

1.00 53.13 

1.00 53.13 

1.00 59.45 

1.00 59.45 

1.00 59.45 

1.00 59.45 

1.00 59.45 

1.00 59.45 

1.00 59.45 

1.00 53.13 

1.00 53.13 

1.00 50.44 

1.00 50.44 

1.00 64.56 

1.00 64.55 

1.00 50.44 

1.00 50.44 

1.00 66.85 

1.00 66.85 

1.00 83.28 
ATOM 574 CG GLU A 82 -22.525 44.394 54.870 1.00 83.28 A
ATOM 575 CD GLU A 82 -23.218 44.684 56.189 1.00 83.28 A
ATOM 576 OE1 GLU A 82 -24.087 43.882 56.598 1.00 83.28 A
ATOM 577 OE2 GLU A 82 -22.894 45.717 56.815 1.00 83.28 A
ATOM 578 C GLU A 82 -20.178 41.609 53.540 1.00 66.85 A
ATOM 579 O GLU A 82 -19.022 41.982 53.731 1.00 66.85 A
ATOM 580 N LEU A 83 -20.492 40.330 53.345 1.00 56.62 A
ATOM 581 CA LEU A 83 -19.459 39.300 53.358 1.00 56.62 A
ATOM 582 CB LEU A 83 -20.093 37.912 53.501 1.00 52.40 A
ATOM 583 CG LEU A 83 -20.801 37.640 54.828 1.00 52.40 A
ATOM 584 CD1 LEU A 83 -21.641 36.383 54.724 1.00 52.40 A
ATOM 585 CD2 LEU A 83 -19.766 37.523 55.931 1.00 52.40 A
ATOM 586 C LEU A 83 -18.647 39.357 52.070 1.00 56.62 A
ATOM 587 O LEU A 83 -19.192 39.626 50.995 1.00 56.62 A
ATOM 588 N ARG A 84 -17.342 39.125 52.173 1.00 53.39 A
ATOM 589 CA ARG A 84 -16.496 39.126 50.988 1.00 53.39 A
ATOM 590 CB ARG A 84 -15.017 39.007 51.380 1.00 91.29 A
ATOM 591 CG ARG A 84 -14.064 39.372 50.250 1.00 91.29 A
ATOM 592 CD ARG A 84 -12.663 39.752 50.740 1.00 91.29 A
ATOM 593 NE ARG A 84 -11.862 38.615 51.197 1.00 91.29 A
ATOM 594 CZ ARG A 84 -11.893 38.111 52.427 1.00 91.29 A
ATOM 595 NH1 ARG A 84 -12.691 38.639 53.346 1.00 91.29 A
ATOM 596 NH2 ARG A 84 -11.118 37.081 52.743 1.00 91.29 A
ATOM 597 C ARG A 84 -16.955 37.915 50.172 1.00 53.39 A
ATOM 598 O ARG A 84 -17.531 36.981 50.723 1.00 53.39 A
ATOM 599 N ASN A 85 -16.717 37.937 48.867 1.00 57.59 A
ATOM 600 CA ASN A 85 -17.153 36.853 47.992 1.00 57.59 A
ATOM 601 CB ASN A 85 -16.558 37.052 46.603 1.00 68.01 A
ATOM 602 CG ASN A 85 -17.033 38.333 45.957 1.00 68.01 A
ATOM 603 OD1 ASN A 85 -18.185 38.734 46.129 1.00 68.01 A
ATOM 604 ND2 ASN A 85 -16.155 38.979 45.202 1.00 68.01 A
ATOM 605 C ASN A 85 -16.853 35.443 48.495 1.00 57.59 A
ATOM 606 O ASN A 85 -17.744 34.586 48.536 1.00 57.59 A
ATOM 607 N ASP A 86 -15.600 35.217 48.875 1.00 55.76 A
ATOM 608 CA ASP A 86 -15.141 33.930 49.383 1.00 55.76 A
ATOM 609 CB ASP A 86 -13.758 34.093 50.011 1.00137.78 A
ATOM 610 CG ASP A 86 -12.837 34.954 49.169 1.00137.78 A
ATOM 611 OD1 ASP A 86 -12.539 34.564 48.021 1.00137.78 A
ATOM 612 OD2 ASP A 86 -12.415 36.025 49.555 1.00137.78 A
ATOM 613 C ASP A 86 -16.105 33.384 50.422 1.00 55.76 A
ATOM 614 O ASP A 86 -16.581 32.258 50.310 1.00 55.76 A
ATOM 615 N TYR A 87 -16.401 34.193 51.433 1.00 48.09 A
ATOM 616 CA TYR A 87 -17.301 33.773 52.502 1.00 48.09 A
ATOM 617 CB TYR A 87 -17.201 34.751 53.673 1.00 85.93 A
ATOM 618 CG TYR A 87 -15.788 34.829 54.202 1.00 85.93 A
ATOM 619 CD1 TYR A 87 -15.143 33.686 54.677 1.00 85.93 A
ATOM 620 CE1 TYR A 87 -13.813 33.724 55.088 1.00 85.93 A
ATOM 621 CD2 TYR A 87 -15.068 36.021 54.162 1.00 85.93 A
ATOM 622 CE2 TYR A 87 -13.735 36.071 54.574 1.00 85.93 A
ATOM 623 CZ TYR A 87 -13.115 34.917 55.033 1.00 85.93 A
ATOM 624 OH TYR A 87 -11.794 34.949 55.420 1.00 85.93 A
ATOM 625 C TYR A 87 -18.740 33.620 52.034 1.00 48.09 A
ATOM 626 O TYR A 87 -19.474 32.761 52.541 1.00 48.09 A
ATOM 627 N GLN A 88 -19.147 34.440 51.063 1.00 44.08 A
ATOM 628 CA GLN A 88 -20.513 34.337 50.540 1.00 44.08 A
ATOM 629 CB GLN A 88 -20.767 35.371 49.431 1.00 54.63 A
ATOM 630 CG GLN A 88 -20.763 36.823 49.883 1.00 54.63 A
ATOM 631 CD GLN A 88 -21.411 37.755 48.863 1.00 54.63
ATOM 632 OE1 GLN A 88 -21.115 37.694 47.674 1.00 54.63
ATOM 633 NE2 GLN A 88 -22.296 38.626 49.335 1.00 54.63
ATOM 634 C GLN A 88 -20.681 32.928 49.966 1.00 44.08
ATOM 635 O GLN A 88 -21.627 32.229 50.290 1.00 44.08
ATOM 636 N SER A 89 -19.733 32.524 49.122 1.00 48.72
ATOM 637 CA SER A 89 -19.754 31.209 48.494 1.00 48.72
ATOM 638 CB SER A 89 -18.612 31.098 47.491 1.00 56.82
ATOM 639 OG SER A 89 -18.808 31.996 46.409 1.00 56.82
ATOM 640 C SER A 89 -19.658 30.090 49.528 1.00 48.72
ATOM 641 O SER A 89 -20.340 29.067 49.419 1.00 48.72
ATOM 642 N LYS A 90 -18.816 30.287 50.537 1.00 50.16
ATOM 643 CA LYS A 90 -18.669 29.292 51.588 1.00 50.16
ATOM 644 CB LYS A 90 -17.549 29.686 52.554 1.00 67.64
ATOM 645 CG LYS A 90 -16.155 29.471 51.999 1.00 67.64
ATOM 646 CD LYS A 90 -15.095 29.962 52.966 1.00 67.64
ATOM 647 CE LYS A 90 -13.702 29.668 52.446 1.00 67.64
ATOM 648 NZ LYS A 90 -12.660 30.310 53.293 1.00 67.64
ATOM 649 C LYS A 90 -19.973 29.144 52.343 1.00 50.16
ATOM 650 O LYS A 90 -20.401 28.027 52.641 1.00 50.16
ATOM 651 N LEU A 91 -20.617 30.267 52.649 1.00 45.26
ATOM 652 CA LEU A 91 -21.879 30.218 53.369 1.00 45.26
ATOM 653 CB LEU A 91 -22.335 31.632 53.735 1.00 57.37
ATOM 654 CG LEU A 91 -23.594 31.719 54.610 1.00 57.37
ATOM 655 CD1 LEU A 91 -23.430 30.848 55.854 1.00 57.37
ATOM 656 CD2 LEU A 91 -23.846 33.173 54.999 1.00 57.37
ATOM 657 C LEU A 91 -22.950 29.506 52.533 1.00 45.26
ATOM 658 O LEU A 91 -23.776 28.769 53.074 1.00 45.26
ATOM 659 N VAL A 92 -22.927 29.714 51.217 1.00 42.08
ATOM 660 CA VAL A 92 -23.894 29.072 50.322 1.00 42.08
ATOM 661 CB VAL A 92 -23.832 29.683 48.888 1.00 46.56
ATOM 662 CG1 VAL A 92 -24.704 28.879 47.931 1.00 46.56
ATOM 663 CG2 VAL A 92 -24.298 31.134 48.921 1.00 46.56
ATOM 664 C VAL A 92 -23.628 27.567 50.254 1.00 42.08
ATOM 665 O VAL A 92 -24.561 26.757 50.226 1.00 42.08
ATOM 666 N LEU A 93 -22.354 27.190 50.235 1.00 48.30
ATOM 667 CA LEU A 93 -22.000 25.775 50.211 1.00 48.30
ATOM 668 CB LEU A 93 -20.479 25.603 50.264 1.00 40.14
ATOM 669 CG LEU A 93 -19.956 24.161 50.241 1.00 40.14
ATOM 670 CD1 LEU A 93 -20.456 23.449 48.998 1.00 40.14
ATOM 671 CD2 LEU A 93 -18.417 24.163 50.265 1.00 40.14
ATOM 672 C LEU A 93 -22.652 25.106 51.424 1.00 48.30
ATOM 673 O LEU A 93 -23.301 24.061 51.297 1.00 48.30
ATOM 674 N ARG A 94 -22.493 25.718 52.598 1.00 49.80
ATOM 675 CA ARG A 94 -23.095 25.183 53.822 1.00 49.80
ATOM 676 CB ARG A 94 -22.714 26.037 55.036 1.00 56.97
ATOM 677 CG ARG A 94 -23.425 25.620 56.318 1.00 56.97
ATOM 678 CD ARG A 94 -22.934 24.260 56.798 1.00 56.97
ATOM 679 NE ARG A 94 -23.634 23.785 57.990 1.00 56.97
ATOM 680 CZ ARG A 94 -24.781 23.110 57.977 1.00 56.97
ATOM 681 NH1 ARG A 94 -25.377 22.822 56.827 1.00 56.97
ATOM 682 NH2 ARG A 94 -25.325 22.706 59.119 1.00 56.97
ATOM 683 C ARG A 94 -24.620 25.131 53.710 1.00 49.80
ATOM 684 O ARG A 94 -25.248 24.167 54.155 1.00 49.80
ATOM 685 N LEU A 95 -25.217 26.166 53.119 1.00 58.97
ATOM 686 CA LEU A 95 -26.672 26.204 52.958 1.00 58.97
ATOM 687 CB LEU A 95 -27.128 27.587 52.468 1.00 59.80 

ATOM 688 CG LEU A 95 -27.061 28.699 53.525 1.00 59.80 

ATOM 689 CD1 LEU A 95 -27.396 30.043 52.905 1.00 59.80 

ATOM 690 CD2 LEU A 95 -28.026 28.375 54.661 1.00 59.80 

ATOM 691 C LEU A 95 -27.161 25.116 52.005 1.00 58.97 

ATOM 692 O LEU A 95 -28.223 24.530 52.218 1.00 58.97 

ATOM 693 N LEU A 96 -26.398 24.848 50.947 1.00 52.25 

ATOM 694 CA LEU A 96 -26.777 23.788 50.012 1.00 52.25 

ATOM 695 CB LEU A 96 -25.732 23.649 48.893 1.00 39.29 

ATOM 696 CG LEU A 96 -25.307 24.714 47.790 1.00 39.29 

ATOM 697 CD1 LEU A 96 -24.683 24.540 46.806 1.00 39.29 

ATOM 698 CD2 LEU A 96 -27.155 24.593 47.086 1.00 39.29 

ATOM 699 C LEU A 96 -26.894 22.465 50.772 1.00 52.25 

ATOM 700 O LEU A 96 -27.829 21.690 50.556 1.00 52.25 

ATOM 701 N LYS A 97 -25.942 22.219 51.667 1.00 61.38 

ATOM 702 CA LYS A 97 -25.927 20.994 52.452 1.00 61.38 

ATOM 703 CB LYS A 97 -24.729 20.994 53.412 1.00 64.44 

ATOM 704 CG LYS A 97 -24.551 19.703 54.207 1.00 64.44 

ATOM 705 CD LYS A 97 -23.899 18.619 53.361 1.00 64.44 

ATOM 706 CE LYS A 97 -23.566 17.382 54.184 1.00 64.44 

ATOM 707 NZ LYS A 97 -24.768 16.670 54.680 1.00 64.44 

ATOM 708 C LYS A 97 -27.211 20.852 53.275 1.00 61.38 

ATOM 709 O LYS A 97 -27.734 19.747 53.434 1.00 61.38 

ATOM 710 N GLU A 98 -27.720 21.968 53.791 1.00 62.33 

ATOM 711 CA GLU A 98 -28.944 21.932 54.584 1.00 62.33 

ATOM 712 CB GLU A 98 -29.188 23.278 55.271 1.00 83.72 

ATOM 713 CG GLU A 98 -28.308 23.498 56.484 1.00 83.72 

ATOM 714 CD GLU A 98 -28.430 22.368 57.498 1.00 83.72 

ATOM 715 OE1 GLU A 98 -29.536 22.178 58.048 1.00 83.72 

ATOM 716 OE2 GLU A 98 -27.422 21.667 57.740 1.00 83.72 

ATOM 717 C GLU A 98 -30.163 21.539 53.767 1.00 62.33 

ATOM 718 O GLU A 98 -31.200 21.186 54.327 1.00 62.33 

ATOM 719 N ASN A 99 -30.043 21.599 52.445 1.00 61.59 

ATOM 720 CA ASN A 99 -31.153 21.224 51.589 1.00 61.59 

ATOM 721 CB ASN A 99 -31.361 22.255 50.480 1.00 72.80 

ATOM 722 CG ASN A 99 -31.899 23.573 51.007 1.00 72.80 

ATOM 723 OD1 ASN A 99 -31.174 24.347 51.536 1.00 72.80 

ATOM 724 ND2 ASN A 99 -33.182 23.826 50.765 1.00 72.80 

ATOM 725 C ASN A 99 -30.902 19.857 50.983 1.00 61.59 

ATOM 726 O ASN A 99 -31.506 19.500 49.972 1.00 61.59 

ATOM 727 N GLY A 100 -30.002 19.100 51.607 1.00 63.74 

ATOM 728 CA GLY A 100 -29.684 17.768 51.126 1.00 63.74 

ATOM 729 C GLY A 100 -28.749 17.717 49.932 1.00 63.74 

ATOM 730 O GLY A 100 -28.641 16.687 49.269 1.00 63.74 

ATOM 731 N ILE A 101 -28.071 18.821 49.642 1.00 49.44 

ATOM 732 CA ILE A 101 -27.148 18.844 48.515 1.00 49.44 

ATOM 733 CB ILE A 101 -27.453 20.023 47.575 1.00 51.42 

ATOM 734 CG2 ILE A 101 -26.437 20.066 46.432 1.00 51.42 

ATOM 735 CG1 ILE A 101 -28.873 19.871 47.022 1.00 51.42 

ATOM 736 CD1 ILE A 101 -29.328 21.045 46.203 1.00 51.42 

ATOM 737 C ILE A 101 -25.722 18.951 49.035 1.00 49.44 

ATOM 738 O ILE A 101 -25.212 20.049 49.255 1.00 49.44 

ATOM 739 N GLY A 102 -25.097 17.795 49.245 1.00 43.56 

ATOM 740 CA GLY A 102 -23.734 17.758 49.744 1.00 43.56 

ATOM 741 C GLY A 102 -22.955 16.629 49.100 1.00 43.56 

ATOM 742 O GLY A 102 -23.513 15.823 48.349 1.00 43.56 

ATOM 743 N GLU A 103 -21.665 16.560 49.391 1.00 42.81 




ATOM 744 CA GLU A 103 -20.827 15.517 48.819 1.00 42.81 

ATOM 745 CB GLU A 103 -19.385 15.702 49.282 1.00 48.36 

ATOM 746 CG GLU A 103 -18.764 16.951 48.688 1.00 48.36 

ATOM 747 CD GLU A 103 -17.347 17.165 49.132 1.00 48.36 

ATOM 748 OE1 GLU A 103 -16.671 18.048 48.569 1.00 48.36 

ATOM 749 OE2 GLU A 103 -16.905 16.451 50.050 1.00 48.36 

ATOM 750 C GLU A 103 -21.321 14.123 49.161 1.00 42.81 

ATOM 751 O GLU A 103 -21.320 13.234 48.311 1.00 42.81 

ATOM 752 N TYR A 104 -21.751 13.935 50.404 1.00 45.75 

ATOM 753 CA TYR A 104 -22.253 12.642 50.838 1.00 45.75 

ATOM 754 CB TYR A 104 -22.697 12.707 52.300 1.00 64.47 

ATOM 755 CG TYR A 104 -23.400 11.453 52.767 1.00 64.47 

ATOM 756 CD1 TYR A 104 -22.714 10.245 52.870 1.00 64.47 

ATOM 757 CE1 TYR A 104 -23.357 9.083 53.292 1.00 64.47 

ATOM 758 CD2 TYR A 104 -24.754 11.472 53.097 1.00 64.47 

ATOM 759 CE2 TYR A 104 -25.409 10.317 53.522 1.00 64.47 

ATOM 760 CZ TYR A 104 -24.704 9.124 53.618 1.00 64.47 

ATOM 761 OH TYR A 104 -25.341 7.977 54.049 1.00 64.47 

ATOM 762 C TYR A 104 -23.431 12.202 49.972 1.00 45.75 

ATOM 763 O TYR A 104 -23.380 11.156 49.327 1.00 45.75 

ATOM 764 N GLU A 105 -24.490 13.007 49.955 1.00 43.71 

ATOM 765 CA GLU A 105 -25.683 12.683 49.172 1.00 43.71 

ATOM 766 CB GLU A 105 -26.764 13.751 49.376 1.00 62.74 

ATOM 767 CG GLU A 105 -27.287 13.911 50.803 1.00 62.74 

ATOM 768 CD GLU A 105 -26.311 14.619 51.723 1.00 62.74 

ATOM 769 OE1 GLU A 105 -25.458 15.386 51.225 1.00 62.74 

ATOM 770 OE2 GLU A 105 -26.407 14.421 52.952 1.00 62.74 

ATOM 771 C GLU A 105 -25.431 12.526 47.662 1.00 43.71 

ATOM 772 O GLU A 105 -25.856 11.544 47.064 1.00 43.71 

ATOM 773 N LEU A 106 -24.754 13.490 47.040 1.00 39.82 

ATOM 774 CA LEU A 106 -24.504 13.402 45.606 1.00 39.82 

ATOM 775 CB LEU A 106 -24.109 14.779 45.042 1.00 40.67 

ATOM 776 CG LEU A 106 -25.330 15.671 44.722 1.00 40.67 

ATOM 777 CD1 LEU A 106 -26.094 16.030 45.998 1.00 40.67 

ATOM 778 CD2 LEU A 106 -24.872 16.930 44.023 1.00 40.67 

ATOM 779 C LEU A 106 -23.489 12.325 45.204 1.00 39.82 

ATOM 780 O LEU A 106 -23.542 11.822 44.082 1.00 39.82 

ATOM 781 N SER A 107 -22.570 11.966 46.103 1.00 38.34 

ATOM 782 CA SER A 107 -21.611 10.905 45.794 1.00 38.34 

ATOM 783 CB SER A 107 -20.526 10.800 46.870 1.00 39.34 

ATOM 784 OG SER A 107 -19.582 11.852 46.752 1.00 39.34 

ATOM 785 C SER A 107 -22.385 9.589 45.706 1.00 38.34 

ATOM 786 O SER A 107 -22.087 8.730 44.872 1.00 38.34 

ATOM 787 N LYS A 108 -23.388 9.436 46.568 1.00 46.95 

ATOM 788 CA LYS A 108 -24.212 8.231 46.549 1.00 46.95 

ATOM 789 CB LYS A 108 -25.175 8.218 47.740 1.00 70.68 

ATOM 790 CG LYS A 108 -24.480 7.977 49.075 1.00 70.68 

ATOM 791 CD LYS A 108 -25.396 8.254 50.258 1.00 70.68 

ATOM 792 CE LYS A 108 -26.610 7.350 50.247 1.00 70.68 

ATOM 793 NZ LYS A 108 -27.445 7.570 51.455 1.00 70.68 

ATOM 794 C LYS A 108 -24.990 8.184 45.235 1.00 46.95 

ATOM 795 O LYS A 108 -25.078 7.141 44.600 1.00 46.95 

ATOM 796 N LEU A 109 -25.549 9.319 44.825 1.00 42.18 

ATOM 797 CA LEU A 109 -26.291 9.376 43.575 1.00 42.18 

ATOM 798 CB LEU A 109 -26.873 10.783 43.358 1.00 40.49 

ATOM 799 CG LEU A 109 -27.945 11.260 44.354 1.00 40.49 
ATOM 800 CD1 LEU A 109 -28.387 12.689 44.012 1.00 40.49 

ATOM 801 CD2 LEU A 109 -29.150 10.316 44.331 1.00 40.49 

ATOM 802 C LEU A 109 -25.373 8.983 42.410 1.00 42.18 

ATOM 803 O LEU A 109 -25.772 8.217 41.539 1.00 42.18 

ATOM 804 N LEU A 110 -24.146 9.499 42.395 1.00 37.22 

ATOM 805 CA LEU A 110 -23.194 9.163 41.329 1.00 37.22 

ATOM 806 CB LEU A 110 -21.843 9.869 41.557 1.00 28.86 

ATOM 807 CG LEU A 110 -20.639 9.415 40.701 1.00 28.86 

ATOM 808 CD1 LEU A 110 -20.876 9.791 39.218 1.00 28.86 

ATOM 809 CD2 LEU A 110 -19.329 10.081 41.214 1.00 28.86 

ATOM 810 C LEU A 110 -22.943 7.658 41.215 1.00 37.22 

ATOM 811 O LEU A 110 -22.899 7.101 40.101 1.00 37.22 

ATOM 812 N ARG A 111 -22.761 7.001 42.356 1.00 45.48 

ATOM 813 CA ARG A 111 -22.495 5.569 42.353 1.00 45.48 

ATOM 814 CB ARG A 111 -21.988 5.111 43.730 1.00 39.33 

ATOM 815 CG ARG A 111 -20.594 5.692 44.031 1.00 39.33 

ATOM 816 CD ARG A 111 -19.960 5.096 45.270 1.00 39.33 

ATOM 817 NE ARG A 111 -20.584 5.586 45.496 1.00 39.33 

ATOM 818 CZ ARG A 111 -20.157 6.633 47.196 1.00 39.33 

ATOM 819 NH1 ARG A 111 -19.089 7.316 46.794 1.00 39.33 

ATOM 820 NH2 ARG A 111 -20.799 6.990 48.305 1.00 39.33 

ATOM 821 C ARG A 111 -23.685 4.737 41.891 1.00 45.48 

ATOM 822 O ARG A 111 -23.585 3.516 41.753 1.00 45.48 

ATOM 823 N LYS A 112 -24.808 5.397 41.629 1.00 49.69 

ATOM 824 CA LYS A 112 -25.963 4.676 41.122 1.00 49.69 

ATOM 825 CB LYS A 112 -27.249 5.481 41.321 1.00 66.26 

ATOM 826 CG LYS A 112 -27.793 5.461 42.733 1.00 66.26 

ATOM 827 CD LYS A 112 -29.187 6.068 42.771 1.00 66.26 

ATOM 828 CE LYS A 112 -29.856 5.845 44.120 1.00 66.26 

ATOM 829 NZ LYS A 112 -31.276 6.297 44.113 1.00 66.26 

ATOM 830 C LYS A 112 -25.729 4.455 39.629 1.00 49.69 

ATOM 831 O LYS A 112 -26.286 3.531 39.032 1.00 49.69 

ATOM 832 N PHE A 113 -24.881 5.301 39.045 1.00 44.64 

ATOM 833 CA PHE A 113 -24.569 5.248 37.614 1.00 44.64 

ATOM 834 CB PHE A 113 -24.845 6.617 36.996 1.00 42.87 

ATOM 835 CG PHE A 113 -26.215 7.152 37.323 1.00 42.87 

ATOM 836 CD1 PHE A 113 -27.331 6.729 36.607 1.00 42.87 

ATOM 837 CD2 PHE A 113 -26.397 8.034 38.386 1.00 42.87 

ATOM 838 CE1 PHE A 113 -28.608 7.171 36.948 1.00 42.87 

ATOM 839 CE2 PHE A 113 -27.675 8.483 38.735 1.00 42.87 

ATOM 840 CZ PHE A 113 -28.782 8.047 38.014 1.00 42.87 

ATOM 841 C PHE A 113 -23.133 4.827 37.313 1.00 44.64 

ATOM 842 O PHE A 113 -22.881 4.157 36.309 1.00 44.64 

ATOM 843 N ARG A 114 -22.198 5.243 38.169 1.00 42.32 

ATOM 844 CA ARG A 114 -20.784 4.899 38.022 1.00 42.32 

ATOM 845 CB ARG A 114 -19.933 6.167 37.868 1.00 50.54 

ATOM 846 CG ARG A 114 -18.470 5.926 37.474 1.00 50.54 

ATOM 847 CD ARG A 114 -18.357 5.175 36.149 1.00 50.54 

ATOM 848 NE ARG A 114 -17.008 5.231 35.590 1.00 50.54 

ATOM 849 CZ ARG A 114 -16.596 4.539 34.529 1.00 50.54 

ATOM 850 NH1 ARG A 114 -17.430 3.723 33.896 1.00 50.54 

ATOM 851 NH2 ARG A 114 -15.344 4.648 34.106 1.00 50.54 

ATOM 852 C ARG A 114 -20.428 4.167 39.316 1.00 42.32 

ATOM 853 O ARG A 114 -20.106 4.785 40.339 1.00 42.32 

ATOM 854 N LYS A 115 -20.496 2.844 39.260 1.00 44.60 

ATOM 855 CA LYS A 115 -20.248 2.008 40.433 1.00 44.60 

ATOM 856 CB LYS A 115 -20.946 0.657 40.255 1.00 77.56 

ATOM 857 CG LYS A 115 -22.445 0.743 40.062 1.00 77.56 

ATOM 858 CD LYS A 115 -23.035 -0.639 39.853 1.00 77.56 

ATOM 859 CE LYS A 115 -24.543 -0.582 39.674 1.00 77.56 

ATOM 860 NZ LYS A 115 -25.133 -1.952 39.542 1.00 77.56 

ATOM 861 C LYS A 115 -18.795 1.747 40.795 1.00 44.60 

ATOM 862 O LYS A 115 -17.912 1.822 39.954 1.00 44.60 

ATOM 863 N PRO A 116 -18.534 1.449 42.076 1.00 45.82 

ATOM 864 CD PRO A 116 -19.441 1.560 43.227 1.00 41.36 

ATOM 865 CA PRO A 116 -17.172 1.161 42.521 1.00 45.82 

ATOM 866 CB PRO A 116 -17.343 0.883 44.013 1.00 41.36 

ATOM 867 CG PRO A 116 -18.457 1.773 44.387 1.00 41.36 

ATOM 868 C PRO A 116 -16.751 -0.094 41.772 1.00 45.82 

ATOM 869 O PRO A 116 -17.578 -0.970 41.504 1.00 45.82 

ATOM 870 N LYS A 117 -15.474 -0.194 41.442 1.00 44.22 

ATOM 871 CA LYS A 117 -15.002 -1.361 40.724 1.00 44.22 

ATOM 872 CB LYS A 117 -14.766 -1.003 39.258 1.00 53.13 

ATOM 873 CG LYS A 117 -14.467 -2.185 38.357 1.00 53.13 

ATOM 874 CD LYS A 117 -14.588 -1.775 36.896 1.00 53.13 

ATOM 875 CE LYS A 117 -14.312 -2.934 35.946 1.00 53.13 

ATOM 876 NZ LYS A 117 -14.623 -2.553 34.526 1.00 53.13 

ATOM 877 C LYS A 117 -13.712 -1.820 41.380 1.00 44.22 

ATOM 878 O LYS A 117 -12.877 -1.004 41.766 1.00 44.22 

ATOM 879 N THR A 118 -13.553 -3.131 41.509 1.00 52.99 

ATOM 880 CA THR A 118 -12.363 -3.672 42.135 1.00 52.99 

ATOM 881 CB THR A 118 -12.730 -4.747 43.168 1.00 74.27 

ATOM 882 OG1 THR A 118 -13.546 -5.744 42.546 1.00 74.27 

ATOM 883 CG2 THR A 118 -13.496 -4.128 44.325 1.00 74.27 

ATOM 884 C THR A 118 -11.394 -4.260 41.130 1.00 52.99 

ATOM 885 O THR A 118 -11.786 -4.960 40.201 1.00 52.99 

ATOM 886 N PHE A 119 -10.123 -3.941 41.319 1.00 43.70 

ATOM 887 CA PHE A 119 -9.043 -4.434 40.469 1.00 43.70 

ATOM 888 CB PHE A 119 -8.372 -3.274 39.714 1.00 50.21 

ATOM 889 CG PHE A 119 -9.294 -2.541 38.763 1.00 50.21 

ATOM 890 CD1 PHE A 119 -9.266 -2.811 37.393 1.00 50.21 

ATOM 891 CD2 PHE A 119 -10.199 -1.597 39.239 1.00 50.21 

ATOM 892 CE1 PHE A 119 -10.137 -2.142 36.508 1.00 50.21 

ATOM 893 CE2 PHE A 119 -11.067 -0.931 38.371 1.00 50.21 

ATOM 894 CZ PHE A 119 -11.037 -1.202 37.003 1.00 50.21 

ATOM 895 C PHE A 119 -8.055 -5.046 41.454 1.00 43.70 

ATOM 896 O PHE A 119 -7.467 -4.328 42.280 1.00 43.70 

ATOM 897 N GLY A 120 -7.883 -6.364 41.393 1.00 44.94 

ATOM 898 CA GLY A 120 -6.966 -7.019 42.312 1.00 44.94 

ATOM 899 C GLY A 120 -7.471 -6.796 43.723 1.00 44.94 

ATOM 900 O GLY A 120 -8.652 -7.033 44.002 1.00 44.94 

ATOM 901 N ASP A 121 -6.598 -6.326 44.610 1.00 55.66 

ATOM 902 CA ASP A 121 -6.980 -6.071 45.999 1.00 55.66 

ATOM 903 CB ASP A 121 -5.795 -6.302 46.950 1.00 64.03 

ATOM 904 CG ASP A 121 -5.403 -7.762 47.070 1.00 64.03 

ATOM 905 OD1 ASP A 121 -6.305 -8.622 47.160 1.00 64.03 

ATOM 906 OD2 ASP A 121 -4.186 -8.043 47.095 1.00 64.03 

ATOM 907 C ASP A 121 -7.467 -4.643 46.214 1.00 55.66 

ATOM 908 O ASP A 121 -7.723 -4.242 47.350 1.00 55.66 

ATOM 909 N TYR A 122 -7.598 -3.874 45.140 1.00 49.08 

ATOM 910 CA TYR A 122 -8.014 -2.485 45.276 1.00 49.08 

ATOM 911 CB TYR A 122 -6.972 -1.577 44.618 1.00 51.10 

ATOM 912 CG TYR A 122 -5.632 -1.582 45.312 1.00 51.10 

ATOM 913 CD1 TYR A 122 -5.347 -0.674 46.343 1.00 51.10 
940 
941 
953 
954 
955 
956 
957 
958 
960 
962 
963 
964 
965 

CE1 TYR A 122 
CD2 TYR A 122 
CE2 TYR A 122 
CZ TYR A 122 
OH TYR A 122 
C TYR A 122 
O TYR A 122 
N LYS A 123 
CA LYS A 123 
CB LYS A 123 
CG LYS A 123 
CD LYS A 123 
CE LYS A 123 
NZ LYS A 123 
C LYS A 123 
O LYS A 123 
N VAL A 124 
CA VAL A 124 
CB VAL A 124 
CG1 VAL A 124 
CG2 VAL A 124 
C VAL A 124 
O VAL A 124 
N ILE A 125 
CA ILE A 125 
CB ILE A 125 
CG2 ILE A 125 
CG1 ILE A 125 
CD1 ILE A 125 
C ILE A 125 
O ILE A 125 
N PRO A 126 
CD PRO A 126 
CA PRO A 126 
CB PRO A 126 
CG PRO A 126 
C PRO A 126 
O PRO A 126 
N SER A 127 
CA SER A 127 
CB SER A 127 
OG SER A 127 
C SER A 127 
O SER A 127 
N VAL A 128 
CA VAL A 128 
CB VAL A 128 
CG1 VAL A 128 
CG2 VAL A 128 
C VAL A 128 
O VAL A 128 
N GLU A 129 
CA GLU A 129 
CB GLU A 129 
CG GLU A 129 
CD GLU A 129 



-4.136 
-4.665 
-3.443 
-3.186 
-1.991 
-9.396 
-9.815 
-10.110 
-11.421 
-12.487 

-13.869 
-14.896 
-15.216 
-16.267 
-11.369 
-10.870 
-11.858 
-11.918 
-11.800 
-11.982 
-10.447 
-13.290 
-14.317 
-14.553 
-14.541 
-14.110 
-14.856 
-16.561 
-16.225 
-16.993 
-16.782 
-15.702 
-17.448 
-16.843 
-18.025 
-19.399 
-19.995 
-0.713 
-2.526 
-2.577 
-1.748 
-2.162 
-0.875 
-1.060 

-0.625 
0.597 
3.837 
4.436 
3.068 
6.798 
5.949 
7.269 
5.492 
8.071 
11.086 
11.664 
11.848 
13.989 
14.532 
13.633 
14.923 
15.898 
48.154 
43.986 
44.327 
43.122 
41.020 
44.519 
45.912 
43.898 
44.524 
45.388 
44.909 



1.00 51.10 
1.00 51.10 
1.00 51.10 
1.00 51.10 
1.00 51.10 
1.00 49.08 
1.00 49.08 
1.00 46.45 
1.00 46.45 
1.00 59.84 

1.00 59.84 
1.00 59.84 
1.00 59.84 
1.00 59.84 
1.00 46.45 
1.00 46.45 
1.00 43.32 
1.00 43.32 
1.00 44.39
1.00 44.39
1.00 44.39
1.00 43.32
1.00 43.32
1.00 34.73
1.00 34.73
1.00 39.71
1.00 39.71
1.00 39.71
1.00 39.71
1.00 34.73
1.00 34.73
1.00 42.89
1.00 37.78
1.00 42.89
1.00 37.78
1.00 37.78
1.00 42.89
1.00 42.89
1.00 41.13 
1.00 41.13 
1.00 36.83 
1.00 36.83 
1.00 41.13 
1.00 41.13
1.00 41.04 
1.00 41.04 
1.00 42.52 
1.00 42.52 
1.00 42.52 
1.00 41.04 
1.00 41.04 
1.00 39.23 
1.00 39.23 
1.00 59.10 
1.00 59.10 
1.00 59.10 



FIGURE 25 CON'T 
Page 20 of 111 



WO 2006/015258 



PCT/US2005/027084 



ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 
ATOM 



990 
991 
992 
993 
994 
995 
996 
997 
998 
999 
1000 
1001 
1002 
1003 
1004 
1005 
1006 
1007 
1008 
1009 
1010 
1011 
1012 
1013 
1014 
1015 
1016 
1017 
1018 
1019 
1020 
1021 
1022 
1023 
1024 
1025 
1026 



OE1 GLU A 129
OE2 GLU A 129
C GLU A 129
O GLU A 129
N MSE A 130
CA MSE A 130
CB MSE A 130
CG MSE A 130
SE MSE A 130
CE MSE A 130
C MSE A 130
O MSE A 130
N SER A 131
CA SER A 131
CB SER A 131
OG SER A 131
C SER A 131
O SER A 131
N VAL A 132
CA VAL A 132
CB VAL A 132
CG1 VAL A 132
CG2 VAL A 132
C VAL A 132
O VAL A 132
N ILE A 133
CA ILE A 133
CB ILE A 133
CG2 ILE A 133
CG1 ILE A 133
CD1 ILE A 133
C ILE A 133
O ILE A 133
N LYS A 134
CA LYS A 134
CB LYS A 134
CG LYS A 134
CD LYS A 134
CE LYS A 134
NZ LYS A 134
C LYS A 134
O LYS A 134
N HIS A 135
CA HIS A 135
CB HIS A 135
CG HIS A 135
CD2 HIS A 135
ND1 HIS A 135
CE1 HIS A 135
NE2 HIS A 135
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O HIS A 135 
N ASP A 136 
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-16.909 
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-18 .217 
-19.645 
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ATOM 3685 ND2 ASN A 4 99 32.191 12.540 37.346 1.00 51.72 

ATOM 3686 C ASN A 499 28.264 10.967 35.000 1.00 41.29 

ATOM 3687 O ASN A 499 28.703 11.312 33.902 1.00 41.29

ATOM 3688 N VAL A 500 27.303 10.062 35.131 1.00 42.36 

ATOM 3689 CA VAL A 500 26.696 9.463 33.952 1.00 42.36 

ATOM 3690 CB VAL A 500 26.783 7.916 34.003 1.00 38.46 

ATOM 3691 CG1 VAL A 500 26.070 7.305 32.786 1.00 38.46

ATOM 3692 CG2 VAL A 500 28.259 7.483 34.017 1.00 38.46 

ATOM 3693 C VAL A 500 25.238 9.891 33.848 1.00 42.36 

ATOM 3694 O VAL A 500 24.434 9.628 34.753 1.00 42.36 

ATOM 3695 N ILE A 501 24.909 10.572 32.755 1.00 36.97 

ATOM 3696 CA ILE A 501 23.545 11.027 32.521 1.00 36.97 

ATOM 3697 CB ILE A 501 23.498 12.270 31.587 1.00 40.00 

ATOM 3698 CG2 ILE A 501 22.047 12.747 31.416 1.00 40.00 

ATOM 3699 CG1 ILE A 501 24.383 13.393 32.140 1.00 40.00

ATOM 3700 CD1 ILE A 501 24.030 13.826 33.503 1.00 40.00 

ATOM 3701 C ILE A 501 22.888 9.855 31.803 1.00 36.97 

ATOM 3702 O ILE A 501 23.525 9.217 30.965 1.00 36.97 

ATOM 3703 N SER A 502 21.632 9.563 32.113 1.00 35.35 

ATOM 3704 CA SER A 502 20.978 8.435 31.463 1.00 35.35 

ATOM 3705 CB SER A 502 20.817 7.293 32.465 1.00 45.93 

ATOM 3706 OG SER A 502 20.206 7.765 33.644 1.00 45.93 

ATOM 3707 C SER A 502 19.630 8.732 30.796 1.00 35.35 

ATOM 3708 O SER A 502 18.951 9.706 31.115 1.00 35.35 

ATOM 3709 N GLN A 503 19.257 7.866 29.862 1.00 37.71 

ATOM 3710 CA GLN A 503 18.008 8.006 29.124 1.00 37.71 

ATOM 3711 CB GLN A 503 18.284 8.537 27.709 1.00 48.11 

ATOM 3712 CG GLN A 503 17.046 8.679 26.840 1.00 48.11 

ATOM 3713 CD GLN A 503 16.065 9.680 27.410 1.00 48.11 

ATOM 3714 OE1 GLN A 503 16.450 10.786 27.792 1.00 48.11 

ATOM 3715 NE2 GLN A 503 14.790 9.302 27.469 1.00 48.11 

ATOM 3716 C GLN A 503 17.381 6.616 29.042 1.00 37.71 

ATOM 3717 O GLN A 503 17.978 5.580 28.494 1.00 37.71

ATOM 3718 N VAL A 504 16.182 6.493 29.587 1.00 44.85 

ATOM 3719 CA VAL A 504 15.480 5.225 29.600 1.00 44.85 

ATOM 3720 CB VAL A 504 14.784 4.983 30.960 1.00 45.98 

ATOM 3721 CG1 VAL A 504 14.143 3.604 30.968 1.00 45.98

ATOM 3722 CG2 VAL A 504 15.777 5.126 32.099 1.00 45.98 

ATOM 3723 C VAL A 504 14.400 5.124 28.536 1.00 44.85

ATOM 3724 O VAL A 504 13.630 6.056 28.332 1.00 44.85 

ATOM 3725 N VAL A 505 14.360 3.983 27.861 1.00 43.58 

ATOM 3726 CA VAL A 505 13.337 3.685 26.865 1.00 43.58 

ATOM 3727 CB VAL A 505 13.942 3.523 25.458 1.00 38.53 

ATOM 3728 CG1 VAL A 505 12.844 3.259 24.454 1.00 38.53

ATOM 3729 CG2 VAL A 505 14.695 4.777 25.068 1.00 38.53 

ATOM 3730 C VAL A 505 12.761 2.349 27.355 1.00 43.58 

ATOM 3731 O VAL A 505 13.493 1.363 27.490 1.00 43.58 

ATOM 3732 N ASN A 506 11.468 2.318 27.664 1.00 46.42 

ATOM 3733 CA ASN A 506 10.859 1.085 28.159 1.00 46.42 

ATOM 3734 CB ASN A 506 9.706 1.386 29.129 1.00 59.81 

ATOM 3735 CG ASN A 506 8.601 2.214 28.493 1.00 59.81 

ATOM 3736 OD1 ASN A 506 8.264 2.030 27.326 1.00 59.81 

ATOM 3737 ND2 ASN A 506 8.022 3.122 29.269 1.00 59.81 

ATOM 3738 C ASN A 506 10.363 0.172 27.048 1.00 46.42 

ATOM 3739 O ASN A 506 10.230 0.587 25.902 1.00 46.42 

ATOM 3740 N GLU A 507 10.095 -1.077 27.404 1.00 50.99 
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1.00 44.51 
1.00 44.51 
1.00 44.51 
1.00 44.51 
1.00 37.82 
1.00 37.82 
1.00 38.39 
1.00 38.39 
1.00 34.83
1.00 34.83 
1.00 34.83 
1.00 34.83 
1.00 38.39 
1.00 38.39 
1.00 32.98 






ATOM 5891 CA TYR A 768 20.016 15.655 34.494 1.00 32.98
ATOM 5892 CB TYR A 768 21.479 15.628 34.948 1.00 37.63
ATOM 5893 CG TYR A 768 21.718 16.219 36.318 1.00 37.63
ATOM 5894 CD1 TYR A 768 21.427 15.499 37.479 1.00 37.63
ATOM 5895 CE1 TYR A 768 21.680 16.036 38.742 1.00 37.63
ATOM 5896 CD2 TYR A 768 22.266 17.492 36.454 1.00 37.63
ATOM 5897 CE2 TYR A 768 22.527 18.034 37.705 1.00 37.63
ATOM 5898 CZ TYR A 768 22.233 17.310 38.845 1.00 37.63
ATOM 5899 OH TYR A 768 22.488 17.874 40.081 1.00 37.63
ATOM 5900 C TYR A 768 19.206 14.799 35.440 1.00 32.98
ATOM 5901 O TYR A 768 19.243 13.579 35.338 1.00 32.98
ATOM 5902 N PHE A 769 18.486 15.443 36.352 1.00 42.84
ATOM 5903 CA PHE A 769 17.663 14.769 37.349 1.00 42.84
ATOM 5904 CB PHE A 769 17.633 15.633 38.604 1.00 31.31
ATOM 5905 CG PHE A 769 17.232 17.052 38.325 1.00 31.31
ATOM 5906 CD1 PHE A 769 15.885 17.417 38.284 1.00 31.31
ATOM 5907 CD2 PHE A 769 18.194 17.997 37.962 1.00 31.31
ATOM 5908 CE1 PHE A 769 15.505 18.686 37.881 1.00 31.31
ATOM 5909 CE2 PHE A 769 17.819 19.269 37.556 1.00 31.31
ATOM 5910 CZ PHE A 769 16.471 19.616 37.513 1.00 31.31
ATOM 5911 C PHE A 769 16.224 14.553 36.851 1.00 42.84
ATOM 5912 O PHE A 769 15.406 13.941 37.540 1.00 42.84
ATOM 5913 N VAL A 770 15.923 15.068 35.664 1.00 74.51
ATOM 5914 CA VAL A 770 14.588 14.955 35.092 1.00 74.51
ATOM 5915 CB VAL A 770 14.200 16.229 34.315 1.00 61.33
ATOM 5916 CG1 VAL A 770 12.799 16.097 33.781 1.00 61.33
ATOM 5917 CG2 VAL A 770 14.301 17.436 35.202 1.00 61.33
ATOM 5918 C VAL A 770 14.477 13.785 34.132 1.00 74.51
ATOM 5919 O VAL A 770 14.621 14.045 32.917 1.00 74.51
ATOM 5920 OXT VAL A 770 14.260 12.540 34.588 1.00 61.33
TER
HETATM 5921 OH2 TIP 1 18.370 21.810 29.334 1.00 44.97
HETATM 5922 OH2 TIP 2 21.899 9.099 35.356 1.00 40.25
HETATM 5923 OH2 TIP 3 17.715 23.135 26.861 1.00 45.74
HETATM 5924 OH2 TIP 4 -2.476 34.389 46.433 1.00 57.89
HETATM 5925 OH2 TIP 5 -14.987 6.046 36.915 1.00 53.30
HETATM 5926 OH2 TIP 6 -12.088 26.903 41.491 1.00 46.69
HETATM 5927 OH2 TIP 7 -13.937 25.878 43.376 1.00 45.40
HETATM 5928 OH2 TIP 8 -11.412 13.515 38.333 1.00 43.19
HETATM 5929 OH2 TIP 9 -0.863 24.163 41.344 1.00 53.53
HETATM 5930 OH2 TIP 10 -5.288 1.461 64.019 1.00 48.92
HETATM 5931 OH2 TIP 11 2.392 30.592 41.077 1.00 38.99
HETATM 5932 OH2 TIP 12 -0.342 0.215 47.316 1.00 46.86
HETATM 5933 OH2 TIP 13 -9.647 15.151 37.001 1.00 45.57
HETATM 5934 OH2 TIP 14 -10.301 15.778 34.084 1.00 43.77
HETATM 5935 OH2 TIP 15 20.555 11.259 34.330 1.00 45.20
HETATM 5936 OH2 TIP 16 28.742 20.816 42.795 1.00 44.99
HETATM 5937 OH2 TIP 17 -13.195 30.515 39.458 1.00 52.48
HETATM 5938 OH2 TIP 18 -11.704 20.891 31.112 1.00 42.42
HETATM 5939 OH2 TIP 19 -3.276 9.798 47.450 1.00 49.11
HETATM 5940 OH2 TIP 20 -6.804 0.216 75.536 1.00 47.88
HETATM 5941 OH2 TIP 21 -18.325 20.403 49.558 1.00 50.84
HETATM 5942 OH2 TIP 22 -16.649 26.665 47.872 1.00 47.40
HETATM 5943 OH2 TIP 23 -16.485 13.068 25.192 1.00 47.13
HETATM 5944 OH2 TIP 24 -18.046 8.967 49.052 1.00 49.94
HETATM 5945 OH2 TIP 25 9.520 -9.191 28.429 1.00 58.22






HETATM 5946 OH2 TIP 26 16.540 25.632 27.261 1.00 47.57 S
HETATM 5947 OH2 TIP 27 3.118 37.550 32.058 1.00 54.40 S
HETATM 5948 OH2 TIP 28 10.045 -3.931 55.270 1.00 51.11 S
HETATM 5949 OH2 TIP 29 15.202 8.781 31.435 1.00 49.91 S
HETATM 5950 OH2 TIP 30 -26.959 25.672 35.586 1.00 48.67 S
HETATM 5951 OH2 TIP 31 10.791 -9.014 26.464 1.00 59.29 S
HETATM 5952 OH2 TIP 32 -2.669 23.392 42.935 1.00 57.01 S
HETATM 5953 OH2 TIP 33 -19.309 21.014 25.827 1.00 48.36 S
HETATM 5954 OH2 TIP 34 -9.112 11.965 44.438 1.00 56.44 S
HETATM 5955 OH2 TIP 35 -26.009 29.088 33.535 1.00 53.75 S
HETATM 5956 OH2 TIP 36 -5.553 4.212 66.054 1.00 69.21 S
HETATM 5957 OH2 TIP 37 11.338 20.488 23.113 1.00 57.25 S
HETATM 5958 OH2 TIP 38 -9.515 16.282 40.382 1.00 53.65 S
HETATM 5959 OH2 TIP 39 2.189 8.728 53.259 1.00 51.62 S
HETATM 5960 OH2 TIP 40 29.893 20.655 40.220 1.00 61.64 S
HETATM 5961 OH2 TIP 41 -8.168 36.508 40.351 1.00 56.55 S
HETATM 5962 OH2 TIP 42 -16.396 37.700 39.050 1.00 57.28 S
HETATM 5963 OH2 TIP 43 -17.803 38.142 41.395 1.00 62.91 S
HETATM 5964 OH2 TIP 44 6.251 7.179 47.328 1.00 50.89 S
HETATM 5965 OH2 TIP 45 -7.728 22.337 20.181 1.00 51.49 S
HETATM 5966 OH2 TIP 46 6.036 24.040 46.517 1.00 57.58 S
HETATM 5967 OH2 TIP 47 -10.838 11.111 57.417 1.00 72.00 S
HETATM 5968 OH2 TIP 48 9.902 4.698 27.131 1.00 54.85 S
HETATM 5969 OH2 TIP 49 24.470 32.001 48.942 1.00 64.93 S
HETATM 5970 OH2 TIP 50 21.439 27.832 48.102 1.00 55.84 S
HETATM 5971 OH2 TIP 51 -23.212 4.269 47.464 1.00 52.89 S
HETATM 5972 OH2 TIP 52 8.541 4.628 44.863 1.00 68.12 S
HETATM 5973 OH2 TIP 53 -16.370 13.072 47.759 1.00 51.09 S
HETATM 5974 OH2 TIP 54 9.141 18.845 46.330 1.00 55.16 S
HETATM 5975 OH2 TIP 55 -21.814 2.461 46.114 1.00 61.78 S
HETATM 5976 OH2 TIP 56 -28.157 10.364 48.240 1.00 53.11 S
HETATM 5977 OH2 TIP 57 -27.342 32.420 42.439 1.00 51.52 S
HETATM 5978 OH2 TIP 58 -15.983 27.483 57.654 1.00 55.12 S
HETATM 5979 OH2 TIP 59 -16.252 16.477 36.469 1.00 22.88 S
HETATM 5980 OH2 TIP 60 16.220 10.260 4.698 1.00 70.64 S
HETATM 5981 OH2 TIP 61 9.900 7.126 48.174 1.00 69.93 S
HETATM 5982 OH2 TIP 62 -2.545 14.634 63.859 1.00 66.97 S
HETATM 5983 OH2 TIP 63 -25.196 4.635 45.535 1.00 51.71 S
HETATM 5984 OH2 TIP 64 -18.872 -1.352 47.305 1.00 70.97 S
HETATM 5985 OH2 TIP 65 2.709 -12.794 61.990 1.00 71.93 S
HETATM 5986 OH2 TIP 66 -11.260 -6.365 37.836 1.00 73.76 S
HETATM 5987 OH2 TIP 67 -15.786 37.573 36.314 1.00 74.43 S
HETATM 5988 OH2 TIP 68 17.880 22.397 16.014 1.00 54.49 S
HETATM 5989 OH2 TIP 69 -21.502 15.816 52.789 1.00 56.42 S
HETATM 5990 OH2 TIP 70 5.325 -0.467 16.696 1.00 60.24 S
HETATM 5991 OH2 TIP 71 11.117 22.339 24.819 1.00 61.70 S
HETATM 5992 OH2 TIP 72 23.110 11.363 42.088 1.00 46.89 S
HETATM 5993 OH2 TIP 73 21.863 -9.662 36.508 1.00 47.82 S
HETATM 5994 OH2 TIP 74 20.547 46.701 40.738 1.00 72.58 S
HETATM 5995 OH2 TIP 75 33.405 9.009 38.372 1.00 64.06 S
HETATM 5996 OH2 TIP 76 -7.459 19.056 39.490 1.00 53.56 S
HETATM 5997 OH2 TIP 77 -16.279 32.467 44.564 1.00 54.71 S
HETATM 5998 OH2 TIP 78 20.859 37.282 41.728 1.00 53.34 S
HETATM 5999 OH2 TIP 79 -9.099 5.768 64.065 1.00 53.69 S
HETATM 6000 OH2 TIP 80 17.074 48.531 37.844 1.00 57.71 S
HETATM 6001 OH2 TIP 81 -20.591 1.700 36.204 1.00 54.39 S
HETATM 6002 OH2 TIP 82 28.187 29.184 34.781 1.00 68.56 S
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HETATM 6003 
HETATM 6003 OH2 TIP 83 17.032 36.198 26.598 1.00 67.39
HETATM 6004 OH2 TIP 84 22.262 8.106 13.250 1.00 55.06
HETATM 6005 OH2 TIP 85 21.630 15.662 15.725 1.00 70.89
HETATM 6006 OH2 TIP 86 -14.521 1.762 54.240 1.00 56.21
HETATM 6007 OH2 TIP 87 -5.498 -7.282 59.486 1.00 64.11
HETATM 6008 OH2 TIP 88 -14.952 27.909 45.106 1.00 55.41
HETATM 6009 OH2 TIP 89 -9.574 11.072 36.645 1.00 60.66
HETATM 6010 OH2 TIP 90 -19.934 26.786 19.690 1.00 61.56
HETATM 6011 OH2 TIP 91 -19.722 25.179 57.795 1.00 61.53
HETATM 6012 OH2 TIP 92 -5.179 4.991 79.116 1.00 59.85
HETATM 6013 OH2 TIP 93 23.200 29.231 31.435 1.00 56.39
HETATM 6014 OH2 TIP 94 14.477 10.996 37.967 1.00 62.57
HETATM 6015 OH2 TIP 95 5.075 38.959 28.482 1.00 60.07
HETATM 6016 OH2 TIP 96 3.783 42.153 55.054 1.00 83.33
HETATM 6017 OH2 TIP 97 4.878 29.768 52.759 1.00 52.19
HETATM 6018 OH2 TIP 98 -28.304 35.106 42.428 1.00 68.52
HETATM 6019 OH2 TIP 99 -6.647 -10.395 70.918 1.00 54.01
HETATM 6020 OH2 TIP 100 -21.038 41.643 47.045 1.00 67.28
HETATM 6021 OH2 TIP 101 34.025 6.004 38.783 1.00 54.10
HETATM 6022 OH2 TIP 102 36.250 4.424 29.906 1.00 61.46
HETATM 6023 OH2 TIP 103 -19.532 -2.215 42.785 1.00 68.84
HETATM 6024 OH2 TIP 104 -28.578 38.820 54.834 1.00 61.91
HETATM 6025 OH2 TIP 105 -31.547 35.211 67.941 1.00 72.99
HETATM 6026 OH2 TIP 106 25.271 22.710 46.374 1.00 68.08
HETATM 6027 OH2 TIP 107 -25.496 28.500 31.122 1.00 51.94
HETATM 6028 OH2 TIP 108 30.641 -4.118 25.389 1.00 61.56
HETATM 6029 OH2 TIP 109 -10.470 12.483 33.234 1.00 29.98
HETATM 6030 OH2 TIP 110 -1.801 12.520 50.029 1.00 74.36
HETATM 6031 OH2 TIP 111 6.173 25.074 43.601 1.00 49.94
HETATM 6032 OH2 TIP 112 20.958 -10.184 22.202 1.00 80.32
HETATM 6033 OH2 TIP 113 22.035 25.421 55.991 1.00 64.77
HETATM 6034 OH2 TIP 114 -30.415 16.042 33.842 1.00 57.83
HETATM 6035 OH2 TIP 115 5.151 24.111 49.267 1.00 58.42
HETATM 6036 OH2 TIP 116 -12.603 7.930 52.673 1.00 59.27
HETATM 6037 OH2 TIP 117 27.059 -10.195 35.988 1.00 57.86
HETATM 6038 OH2 TIP 118 -3.826 -5.899 43.779 1.00 51.76
HETATM 6039 OH2 TIP 119 -4.725 20.658 29.307 1.00 62.77
HETATM 6040 OH2 TIP 120 -18.107 11.820 49.900 1.00 68.29
HETATM 6041 OH2 TIP 121 17.151 11.758 34.731 1.00 55.47
HETATM 6042 OH2 TIP 122 8.014 33.927 55.929 1.00 71.42
HETATM 6043 OH2 TIP 123 -3.700 -4.366 41.096 1.00 77.33
HETATM 6044 OH2 TIP 124 0.364 19.226 44.509 1.00 61.07
HETATM 6045 OH2 TIP 125 -24.577 17.454 58.370 1.00 86.24
HETATM 6046 OH2 TIP 126 22.736 -8.506 25.854 1.00 55.67
HETATM 6047 OH2 TIP 127 7.925 -9.335 64.313 1.00 77.83
HETATM 6048 OH2 TIP 128 0.340 27.055 48.265 1.00 63.16
HETATM 6049 OH2 TIP 129 -3.152 41.692 27.224 1.00 52.94
HETATM 6050 OH2 TIP 130 -31.945 32.910 40.837 1.00 57.78
HETATM 6051 OH2 TIP 131 16.495 37.467 29.440 1.00 60.34
HETATM 6052 OH2 TIP 132 -9.277 38.782 34.372 1.00 67.29
HETATM 6053 OH2 TIP 133 -1.692 32.663 53.014 1.00 71.19
HETATM 6054 OH2 TIP 134 -15.239 34.373 42.234 1.00 64.11
HETATM 6055 OH2 TIP 135 9.945 15.897 33.414 1.00 70.18
HETATM 6056 OH2 TIP 136 26.089 32.855 33.020 1.00 73.49
HETATM 6057 OH2 TIP 137 -1.929 26.327 46.767 1.00 64.01
HETATM 6058 OH2 TIP 138 13.583 11.648 31.478 1.00 60.15
HETATM 6059 OH2 TIP 139 -7.307 -6.874 73.320 1.00 63.01
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HETATM 606 0 
HETATM 6060 OH2 TIP 140 9.685 -0.304 44.722 1.00 73.65
HETATM 6061 OH2 TIP 141 -9.735 19.821 43.417 1.00 71.29
HETATM 6062 OH2 TIP 142 13.163 34.923 21.459 1.00 68.38
HETATM 6063 OH2 TIP 143 26.022 30.090 31.402 1.00 68.11
HETATM 6064 OH2 TIP 144 -27.679 1.275 40.061 1.00 63.18
HETATM 6065 OH2 TIP 145 14.352 31.000 21.838 1.00 55.20
HETATM 6066 OH2 TIP 146 2.168 46.394 49.629 1.00 60.35
HETATM 6067 OH2 TIP 147 17.351 9.774 33.057 1.00 69.92
HETATM 6068 OH2 TIP 148 22.741 27.792 29.446 1.00 67.41
HETATM 6069 OH2 TIP 149 25.467 17.052 46.437 1.00 51.85
HETATM 6070 OH2 TIP 150 -8.133 12.899 35.603 1.00 51.72
HETATM 6071 OH2 TIP 151 39.496 12.986 32.142 1.00 62.69
HETATM 6072 OH2 TIP 152 -27.004 20.006 31.382 1.00 55.79
HETATM 6073 OH2 TIP 153 11.578 47.500 43.124 1.00 55.79
HETATM 6074 OH2 TIP 154 -20.336 32.675 28.580 1.00 55.17
HETATM 6075 OH2 TIP 155 13.097 -10.641 36.782 1.00 68.86
HETATM 6076 OH2 TIP 156 5.383 2.741 40.628 1.00 76.91
HETATM 6077 OH2 TIP 157 0.299 -0.391 44.871 1.00 75.15
HETATM 6078 OH2 TIP 158 18.732 30.891 19.910 1.00 57.45
HETATM 6079 OH2 TIP 159 11.519 16.830 31.002 1.00 66.16
HETATM 6080 OH2 TIP 160 5.770 13.582 62.799 1.00 90.10
HETATM 6081 OH2 TIP 161 -19.600 23.785 55.119 1.00 63.82
HETATM 6082 OH2 TIP 162 12.071 48.060 40.615 1.00 68.92
HETATM 6083 OH2 TIP 163 28.289 31.551 33.446 1.00 58.67
HETATM 6084 OH2 TIP 164 -18.857 40.608 48.467 1.00 70.70
HETATM 6085 OH2 TIP 165 -2.972 6.632 69.087 1.00 71.62
HETATM 6086 OH2 TIP 166 5.019 21.239 11.590 1.00 74.98
HETATM 6087 OH2 TIP 167 -4.242 -12.300 64.349 1.00 74.41
HETATM 6088 OH2 TIP 168 6.092 5.418 11.062 1.00 58.26
HETATM 6089 OH2 TIP 169 -9.930 -16.166 70.042 1.00 77.12
HETATM 6090 OH2 TIP 170 27.665 32.448 46.853 1.00 85.11
HETATM 6091 OH2 TIP 171 -26.013 31.633 34.322 1.00 63.66
HETATM 6092 OH2 TIP 172 8.244 2.428 72.981 1.00 72.20
HETATM 6093 OH2 TIP 173 -19.875 32.340 20.985 1.00 65.25
HETATM 6094 OH2 TIP 174 11.462 -2.190 5.568 1.00 75.60
HETATM 6095 OH2 TIP 175 -24.510 4.225 34.154 1.00 65.70
HETATM 6096 OH2 TIP 176 9.210 19.021 23.310 1.00 53.96
HETATM 6097 OH2 TIP 177 -18.628 25.940 53.835 1.00 55.14
HETATM 6098 OH2 TIP 178 1.594 35.321 17.143 1.00 66.35
HETATM 6099 OH2 TIP 179 -35.932 32.773 43.500 1.00 65.83
HETATM 6100 OH2 TIP 180 6.610 0.656 77.404 1.00 68.12
HETATM 6101 OH2 TIP 181 38.280 4.139 33.513 1.00 76.88
HETATM 6102 OH2 TIP 182 23.197 32.109 31.628 1.00 65.86
HETATM 6103 OH2 TIP 183 0.495 11.928 51.178 1.00 76.94
HETATM 6104 OH2 TIP 184 -9.293 15.487 45.977 1.00 77.80
HETATM 6105 OH2 TIP 185 -5.859 38.883 39.260 1.00 53.48
HETATM 6106 OH2 TIP 186 -33.933 18.831 40.589 1.00 72.18
HETATM 6107 OH2 TIP 187 5.875 12.627 58.910 1.00 79.98
HETATM 6108 OH2 TIP 188 2.542 10.779 55.129 1.00 69.62
HETATM 6109 OH2 TIP 189 6.354 22.715 29.312 1.00 83.68
HETATM 6110 OH2 TIP 190 10.534 12.785 55.689 1.00 68.13
HETATM 6111 OH2 TIP 191 -19.033 3.122 60.501 1.00 71.79
HETATM 6112 OH2 TIP 192 15.573 5.668 55.157 1.00 82.05
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